Jérôme François


Passive DNS analysis can reveal important security-related information, such as domains used for phishing, fast flux, or double flux, as well as the type of activity behind a domain (CDN, user tracking, etc.). In parallel, actively probing DNS helps extend what is known about a given domain. We have developed a new tool that exploits the logic behind name assignment, based on natural language modeling and semantic extensions. The tool is available here.


Botnet models

I proposed new models to evaluate the global performance of botnets with respect to the characteristics of the topology used (centralized or P2P). They make it possible to evaluate the robustness of a botnet against network connectivity problems or attacks. The targeted application is network management, which faces scalability problems that can be addressed by using a botnet architecture.
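To make the notion of topology-dependent robustness concrete, here is a minimal sketch that is not the models mentioned above. It assumes two toy topologies (a star for the centralized case, a ring lattice for the P2P case) and measures which fraction of surviving nodes can still be reached from a surviving node after some nodes fail. All function names and parameters are hypothetical.

```python
import random
from collections import deque


def star_topology(n):
    """Centralized toy topology: node 0 is the controller of all others."""
    adj = {i: set() for i in range(n)}
    for i in range(1, n):
        adj[0].add(i)
        adj[i].add(0)
    return adj


def ring_lattice_topology(n, k):
    """P2P-like toy topology: each node links to its k successors on a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def reachable_fraction(adj, failed, source):
    """Fraction of surviving nodes reachable from `source` (breadth-first search)."""
    alive = set(adj) - set(failed)
    if source not in alive:
        return 0.0
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor in alive and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) / len(alive)


def average_reachability(adj, failure_rate, trials=200, seed=0):
    """Average reachable fraction when each node fails independently.

    Trials in which every node fails contribute 0 to the average.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    total = 0.0
    for _ in range(trials):
        failed = {n for n in nodes if rng.random() < failure_rate}
        alive = [n for n in nodes if n not in failed]
        if alive:
            total += reachable_fraction(adj, failed, rng.choice(alive))
    return total / trials


# Losing the controller isolates every node of the star,
# while the ring lattice stays fully connected.
star_after_hub_loss = reachable_fraction(star_topology(10), {0}, source=1)
ring_after_node_loss = reachable_fraction(ring_lattice_topology(10, 2), {0}, source=1)
```

In this toy setting, removing node 0 leaves each star node able to reach only itself, whereas the ring lattice keeps all surviving nodes connected. `average_reachability` extends the comparison to random failures.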


Botnets remain one of the major threats on the Internet today. Detecting botnets in the core of the Internet, at the operator level, is a promising way to counter them since operators have a wider view of the network. However, operators cannot see precisely what happens on an individual machine, as is possible with honeypots, for example. Our goal is therefore to combine this global view with a local view in order to detect entire botnets rather than single bot instances. This leads to the use of recent data mining algorithms.


Device fingerprinting aims at automatically inferring the protocol stack of a remote device. This amounts to identifying the name and version of the software or hardware in use. Our methods are entirely passive and do not need to interact with the remote device being fingerprinted. We have investigated different methods:

  • syntactic fingerprinting: the way a message is constructed is a distinguishing feature
  • behavioral fingerprinting: the way a device interacts with others (protocol state machines) is implementation dependent. This method needs only message types, which can be efficiently recovered by reverse engineering techniques.
  • structural fingerprinting: this method is very close to the syntactic one but does not consider the fine granularity of the syntax. It considers the information structure, such as that given by tshark dissectors.

Our approaches combine these methods with recent classification algorithms such as Support Vector Machines.
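As an illustration of the structural idea described above, the following sketch reduces a dissected message to the set of its field paths, ignoring field values, and labels an unknown message with the most similar known example. This is only an assumption-laden toy: the nested-dict message format, the field names and the device labels are hypothetical, and a nearest-neighbor comparison with Jaccard similarity is swapped in for the Support Vector Machines mentioned above.

```python
def structure_paths(tree, prefix=""):
    """Return the set of field paths of a nested dict, ignoring leaf values."""
    paths = set()
    for name, child in tree.items():
        path = f"{prefix}/{name}"
        paths.add(path)
        if isinstance(child, dict):
            paths |= structure_paths(child, path)
    return paths


def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def identify(message, known):
    """Return (label, score) of the known example closest to `message`.

    `known` maps a device label to a list of example message trees.
    """
    target = structure_paths(message)
    best_label, best_score = None, -1.0
    for label, examples in known.items():
        for example in examples:
            score = jaccard(target, structure_paths(example))
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Hypothetical training examples: two devices that emit different header sets.
known_devices = {
    "device_a": [{"request": {"method": "INVITE"},
                  "headers": {"via": "x", "user_agent": "y"}}],
    "device_b": [{"request": {"method": "INVITE"},
                  "headers": {"via": "x", "contact": "c", "allow": "a"}}],
}

unknown = {"request": {"method": "REGISTER"},
           "headers": {"via": "z", "user_agent": "w"}}

result = identify(unknown, known_devices)  # -> ("device_a", 1.0)
```

The unknown message carries different values but the same field structure as the `device_a` example, so it matches with a similarity of 1.0. It shares only 4 of 7 paths with the `device_b` example.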

The fingerprinting software is available online.

Cloud computing

This project has two directions:

  • leverage cloud computing facilities for innovative security applications, which are increasingly resource-intensive
  • improve security within the cloud

Intrusion Detection

Research in intrusion detection is still needed, as everyday experience and news show. Since standard methods (such as signature-based firewalls) fail to detect zero-day threats, anomaly detection is more promising. One project uses collaboration among routers to detect abnormal flows. Another option is to gather all flow information from the network and apply data mining algorithms.
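To illustrate the difference between anomaly detection and signature matching, here is a deliberately simple sketch. It assumes that flow records have already been aggregated into a number of flows per time interval, learns a baseline from traffic considered normal, and flags intervals that deviate strongly from it. A z-score threshold stands in for the data mining algorithms mentioned above, and the function names and threshold value are hypothetical.

```python
from statistics import mean, pstdev


def fit_baseline(counts):
    """Learn the mean and standard deviation of flows per interval."""
    return mean(counts), pstdev(counts)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A constant baseline makes any different value suspicious.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical flow counts observed during normal operation.
baseline = fit_baseline([100, 110, 90, 105, 95])

normal_interval = is_anomalous(102, baseline)      # False
suspicious_interval = is_anomalous(400, baseline)  # True
```

No signature of a particular attack is needed here: the interval with 400 flows is flagged only because it differs from the learned baseline, which is what allows this family of methods to react to previously unseen threats.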

Topology Optimization

Automatic discovery of network topology:

  • distributed probes on embedded Linux devices
  • remote probe management
  • geolocalized view
  • congestion detection

Ph.D. Thesis

Robustness and Identification of Communicating Applications

The growth of computer networks such as the Internet has led to a huge increase in networked applications and the emergence of many varied protocols. Their complexity varies widely, which results in diverse performance. The first objective of my Ph.D. is to evaluate precisely the robustness of networked applications that are known to be very efficient and appear scalable, such as botnets. To this end, several botnet protocols are emulated. Furthermore, protocol reverse engineering has grown rapidly because many protocols are not well documented. In this domain, the first necessary step is to discover the message types, and this work introduces a novel technique based on support vector machines and new, simple message representations in order to reduce complexity. Finally, there are multiple applications implementing a single protocol, which can be identified with device fingerprinting techniques whose applications relate to security and network management. The first technique proposed in my Ph.D. thesis can work with the previous contribution on reverse engineering, because devices can be identified solely from the types of messages exchanged, which are aggregated into a temporal behavioral tree that includes message delays. In addition, the syntactic tree structure of a message is a good discriminative feature for distinguishing devices, but it has received very little attention until now. Available at http://tel.archives-ouvertes.fr/tel-00442008/en/.



  • Office: B136
  • Madynes - INRIA Nancy Grand Est
  • 615 rue du Jardin Botanique
  • 54600 Villers-lès-Nancy, FRANCE
  • Phone: +33/(0) 3 83 59 30 66