Last Updated: 2019-12-30 01:22:19 UTC
by Johannes Ullrich (Version: 1)
Much of the data we offer is available via our API. A popular feature of our API is our "threat feeds." We use them to distribute lists of IP addresses and hostnames that you may want to block. In particular, our feeds of mining pool IPs and of hosts used by Shodan are popular. This weekend, I added a feed for Onyphe. Onyphe is comparable to Shodan, and I have seen a lot of scans from them lately, which is why I added the feed. While I was working on the API, I also added the ability to retrieve hostnames in addition to IP addresses.
You can get historical data by adding a start and an end date like:
For all of our API functions, you can switch the output format by adding ?json, ?php, ?txt, or ?tab to the end of the URL. The last format (tab-delimited) is only available for data that fits a tab-delimited layout.
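As a minimal sketch of the format switch described above, the helper below appends one of the four suffixes to a URL. The function name `with_format` and the endpoint path are hypothetical and only serve as illustration; the real feed URLs are listed in the API documentation.

```python
# The four output formats named in the text above.
ALLOWED_FORMATS = {"json", "php", "txt", "tab"}


def with_format(url: str, fmt: str) -> str:
    """Append an output-format switch such as ?json to an API URL."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if "?" in url:
        # The switch is described as going at the end of a plain URL.
        raise ValueError("URL already carries a query string")
    return f"{url}?{fmt}"


# Hypothetical endpoint path, used only for illustration.
print(with_format("https://api.example.org/threatlist/onyphe", "json"))
```

Keep in mind that ?tab may not be accepted for every function, so a caller should be ready to fall back to another format.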
Should you use these lists to block attacks? Up to you. I find there is little to be gained, but it may help you keep the noise down in your logs. Before you start to block IPs in the list, I recommend logging first for a while to help you identify false positives. I find the data most useful as input to a SIEM to quickly identify source IPs that belong to these scanners. We are planning on adding a lot more "API features" to make our data easier to consume. Per our Creative Commons license, you may use the data to protect your own network (commercial or not) as long as you do not resell the data. As always, the data is offered "as is."
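The "log first, block later" approach can be sketched as a simple lookup that flags source IPs found in a feed without taking any action. This is only an illustration: the function names are hypothetical, the feed is assumed to be a list of single IPv4 addresses or CIDR networks, and the addresses come from the reserved documentation ranges.

```python
import ipaddress


def load_feed(entries):
    """Turn feed entries (single IPs or CIDR networks) into network objects."""
    return [
        ipaddress.ip_network(entry.strip(), strict=False)
        for entry in entries
        if entry.strip()
    ]


def scanner_hits(source_ips, feed_networks):
    """Return the source IPs that fall inside any feed network (log only, no blocking)."""
    hits = []
    for raw in source_ips:
        addr = ipaddress.ip_address(raw)
        if any(addr in net for net in feed_networks):
            hits.append(raw)
    return hits


# Made-up sample data from documentation address ranges.
feed = load_feed(["192.0.2.10", "198.51.100.0/24"])
for ip in scanner_hits(["192.0.2.10", "203.0.113.5", "198.51.100.77"], feed):
    print(f"known scanner: {ip}")
```

Reviewing the flagged addresses for a while before adding any block rule makes it easier to spot false positives.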
Onyphe made it reasonably easy to collect the data. Not only do they use (like Shodan) hostnames within their onyphe.io domain, but each of their scanner IP addresses also serves a list of their IPs and networks. So once you know one of them (for example, http://barker.onyphe.io/), all you need to do is parse the content of that page. There is, of course, no guarantee that this list is complete. A couple of other sites that offer data from internet-wide scans make it quite difficult to identify their scanners, which is something I am working on with the data we get from our users.
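Parsing such a page could look roughly like the sketch below. The exact page layout is not described here, so this assumes only that IPv4 addresses and CIDR networks appear somewhere in the text; the function name `extract_networks` and the sample string are made up for illustration, and fetching the page is left out.

```python
import ipaddress
import re

# Dotted-quad candidates with an optional /prefix; real validation happens below.
CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}(?:/\d{1,2})?\b")


def extract_networks(page_text):
    """Collect the unique, valid IPv4 addresses and networks found in a block of text."""
    found = set()
    for token in CANDIDATE.findall(page_text):
        try:
            # strict=False tolerates host bits being set in a CIDR entry.
            found.add(ipaddress.ip_network(token, strict=False))
        except ValueError:
            continue  # e.g. an octet above 255 or a prefix above 32
    return sorted(found)


# Made-up sample text; the real page layout may differ.
sample = "scanner 192.0.2.15 and range 198.51.100.0/24, bogus 999.1.1.1"
for net in extract_networks(sample):
    print(net)
```

Single addresses come back as /32 networks, which keeps the result uniform for later matching.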