Another approach to web application fingerprinting
Last Updated: 2018-04-30 11:30:50 UTC
by Remco Verhoef (Version: 1)
Fingerprinting web application versions has always been important, and in light of the recent Drupalgeddon attacks (Yet Another Drupal RCE Vulnerability and More Threat Hunting with User Agent and Drupal Exploits) it is even more so. Naturally, we can identify the version a website runs by simply checking the version number it reports, but that is not always accurate. Many applications disable version identification or report false information, which leads to many false positives when matching against possible vulnerabilities.
Websites can be identified using files like README, stylesheets, templates and other static files. Some existing tools identify version information from the outside using these files, but the results are not always precise, because they often depend on a (small) pre-generated database of file hashes. A while ago I worked on another approach that uses the original (git) repositories. The small proof-of-concept tool, called Identify (http://github.com/dutchcoders/identify), does not rely on a predefined list of hashes alone but works from the original open source repository. Staying current with the latest release is important, and working from the actual repositories keeps the tool up to date. Plugins for applications like WordPress also live in public repositories (e.g. http://plugins.svn.wordpress.org/) and can be probed in the same manner.
Another advantage of using the repositories is that releases are tagged with semantic versions, which makes it easy to match an identified version against the vulnerabilities listed in the CVE database.
Identify first clones the (git) repository of the open source content management system. Next it issues several HTTP requests against the target website (using a predefined list of paths; this is one of the improvements that can be made), generates a hash of each response and compares it against the hashes in the repository. The matching files and hashes are stored in a list, which narrows down the number of possible versions before the results are output.
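The narrowing step can be sketched in a few lines of Python. This is only an illustration and not the code Identify itself uses: the function names, the in-memory `index` layout and the sample files are made up for this example, and hashing the responses as git blob ids (SHA-1 over a `blob <size>\0` header plus the content) is an assumption that would let the hashes be compared directly with the object ids stored in a cloned repository.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def narrow_versions(index, responses):
    """Intersect the tag sets of every response that matches a known blob.

    index:     {path: {blob_id: set of tags that contain this blob at path}}
    responses: {path: raw response body as bytes}
    """
    candidates = None
    for path, body in responses.items():
        tags = index.get(path, {}).get(git_blob_id(body))
        if not tags:
            continue  # modified or unknown file: it tells us nothing
        candidates = set(tags) if candidates is None else candidates & tags
    return candidates if candidates is not None else set()


# Hypothetical repository index with three tags and two static files.
changelog_old = b"version 1\n"
changelog_new = b"version 2\n"
style = b"body { margin: 0 }\n"
index = {
    "CHANGELOG.txt": {
        git_blob_id(changelog_old): {"1.0", "1.1"},
        git_blob_id(changelog_new): {"1.2"},
    },
    "style.css": {git_blob_id(style): {"1.1", "1.2"}},
}

# Pretend these bodies came back from the target site.
fetched = {"CHANGELOG.txt": changelog_old, "style.css": style}
print(sorted(narrow_versions(index, fetched)))  # ['1.1']
```

Files that do not match any known blob are skipped rather than treated as a contradiction, since a site owner may have edited or removed them.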
As this is just a proof of concept, I’m working on some additions to this tool. First, I’d like to generate a database containing the relevant hashes of static files from several repositories, which will be updated on each start. Using this database and comparing all files and hashes, we could narrow the list down to the files that uniquely identify applications and versions, and thereby reduce the number of requests needed against the target site. Fewer requests are better.
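One possible way to keep the number of requests low is a greedy choice: from the hash database, pick the file whose response would leave the fewest candidate versions in the worst case, request it, and repeat. The sketch below is an assumption about how this could look, not an existing feature of the tool; `best_probe`, the paths, the placeholder hash labels and the tags are all hypothetical.

```python
def best_probe(index, candidates):
    """Return the path that splits the candidate versions best, or None.

    index:      {path: {file_hash: set of tags with that hash at path}}
    candidates: set of tags that are still possible
    """
    best_path, best_worst = None, len(candidates)
    for path in sorted(index):  # sorted for a deterministic tie-break
        groups = [tags & candidates for tags in index[path].values()]
        seen = set().union(*groups)
        groups.append(candidates - seen)  # versions that do not ship this file
        worst = max(len(group) for group in groups)
        if worst < best_worst:
            best_path, best_worst = path, worst
    return best_path


# Hypothetical database: hash labels are placeholders, not real digests.
index = {
    "README.txt": {"h1": {"1.0", "1.1"}, "h2": {"1.2", "1.3"}},
    "js/app.js": {"a": {"1.0"}, "b": {"1.1"}, "c": {"1.2"}, "d": {"1.3"}},
    "robots.txt": {"x": {"1.0", "1.1", "1.2", "1.3"}},
}

print(best_probe(index, {"1.0", "1.1", "1.2", "1.3"}))  # js/app.js
```

In this example a single request for `js/app.js` is enough to tell all four versions apart, while `robots.txt` never changed and is not worth requesting at all. When no file can narrow the candidates any further, the function returns `None`.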
Once the list has been narrowed down to a single version, the next step will be to integrate the CVE database to show the vulnerabilities that affect that specific version.
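A minimal sketch of that lookup, assuming the vulnerability data is available as a list of records with an "introduced" and a "fixed" version, could look as follows. The record layout, the function names and the advisory ids are hypothetical placeholders, not real CVE entries. Tags are compared as tuples of integers, because a plain string comparison would rank 7.9 above 7.58.

```python
import re


def parse_tag(tag):
    """Turn a release tag such as 'v7.58' or '8.3.9' into a tuple of ints.

    Returns None for anything else (for example pre-release tags).
    """
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)", tag)
    if not match:
        return None
    parts = [int(part) for part in match.group(1).split(".")]
    parts += [0] * (3 - len(parts))  # pad so that 8.5 compares equal to 8.5.0
    return tuple(parts)


def affected(tag, advisories):
    """List the ids of all advisories whose version range contains tag."""
    version = parse_tag(tag)
    if version is None:
        return []
    return [
        advisory["id"]
        for advisory in advisories
        if parse_tag(advisory["introduced"]) <= version < parse_tag(advisory["fixed"])
    ]


# Hypothetical advisory records; ids and ranges are placeholders.
advisories = [
    {"id": "EXAMPLE-0001", "introduced": "7.0", "fixed": "7.58"},
    {"id": "EXAMPLE-0002", "introduced": "8.0", "fixed": "8.3.9"},
]

print(affected("7.57", advisories))  # ['EXAMPLE-0001']
print(affected("7.58", advisories))  # []
```

Tags that do not look like plain version numbers are ignored here; a real implementation would need a policy for release candidates and vendor-specific tag formats.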