Last Updated: 2019-10-01 17:25:00 UTC
by Johannes Ullrich (Version: 1)
Like pretty much every site that allows comments, ours gets its fair share of spam. Over the years, we implemented a number of countermeasures, so it is always interesting to see what makes it past them. There are a number of recurring themes when it comes to spam:
- VPN advertisements (we also get A LOT of offers from individuals asking us to post their latest VPN comparison)
- Outlook file converters. Odd how often they show up.
One recent technique is the use of excerpts from the article as a comment. The intention may be to bypass various spam filters by hitting the right keywords. I was interested to see if I could learn a bit more about how these spam messages are submitted.
The particular comment I looked at advertised a "Wordpress Security Blog". A search of a random text snippet from the site shows about 100 copies of the content indexed by Google. The comment was left earlier today, at 9:38am UTC. The account was actually created about 4 1/2 hrs earlier and used a Protonmail e-mail address. The user agent is plausible:
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0
This points to Firefox 69 on Windows 7. So this is a current version of Firefox on a somewhat older (but still used) version of Windows. The IP address the comment was posted from appears to be located in India, but the email address suggests a Russian user. All hits for this user came from the same IP address.
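The reading of the user agent above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `describe_user_agent` and the lookup table are made up for illustration, the regular expressions only handle Firefox-on-Windows strings like the one shown, and a user agent is trivially spoofed, so the result is a hint rather than evidence.

```python
import re

# Windows NT version numbers and the product names they are commonly known by.
# Illustrative table, not exhaustive.
NT_VERSIONS = {
    "6.1": "Windows 7",
    "6.2": "Windows 8",
    "6.3": "Windows 8.1",
    "10.0": "Windows 10 or later",
}


def describe_user_agent(ua):
    """Return a (browser, os) guess for a Firefox-on-Windows user agent string."""
    firefox = re.search(r"Firefox/(\d+)", ua)
    windows = re.search(r"Windows NT (\d+\.\d+)", ua)
    browser = f"Firefox {firefox.group(1)}" if firefox else "unknown"
    os_name = NT_VERSIONS.get(windows.group(1), "unknown") if windows else "unknown"
    return browser, os_name


ua = "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0"
print(describe_user_agent(ua))  # ('Firefox 69', 'Windows 7')
```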
What's a bit odd: When you sign up for an account on the site, your email address needs to be verified. I counted 4 email verification attempts. Only one attempt was made to leave a comment, which of course never got approved.
The weblog entries left by the user match a normal browser. All images, style sheets, fonts, and similar files are loaded. The only odd thing is that the first hit is for a password reset URL. It took about 40 seconds from loading the diary post to leaving a comment, which is about "normal". If this is a script, then it does a good job of delaying its actions to look more real. In some cases, it may be a bit too precise. The spammer also left two comments in our glossary, which were almost exactly one minute apart (OK, 62 seconds).
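One way to look for "too precise" timing is to compute the gaps between a user's actions and flag the ones that fall close to a whole number of minutes. The sketch below is an assumption about how one might do this, not the method used for this diary: the function names, the 5-second tolerance, and the two timestamps are all made up for illustration (only the 62-second gap comes from the logs described above).

```python
from datetime import datetime


def gaps_in_seconds(timestamps):
    """Return the gaps, in seconds, between consecutive timestamps."""
    times = sorted(datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps)
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


def near_round_minute(gap, tolerance=5):
    """True if the gap is within `tolerance` seconds of a whole number of minutes."""
    if gap < 60 - tolerance:
        return False  # shorter than (roughly) one minute
    remainder = gap % 60
    return min(remainder, 60 - remainder) <= tolerance


# Hypothetical timestamps, chosen only to be 62 seconds apart.
glossary_comments = ["2019-10-01 09:40:00", "2019-10-01 09:41:02"]

for gap in gaps_in_seconds(glossary_comments):
    print(gap, near_round_minute(gap))  # 62 True
```

A single round gap proves nothing on its own. The check becomes more interesting when many gaps from the same account cluster around the same value.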
So this looks like an at least semi-manual (or "machine-assisted") spam campaign. Here are a couple of things we do to keep spam to a minimum:
- You need to log in to leave a comment. This is probably the largest deterrent.
- To set up an account, we verify an email address.
- We use a stupid simple captcha for the signup process (I find these work better than standard captchas).
- New users' comments need to be approved.
With these countermeasures, the spam comments are very manageable. If you ever see that there are "x" comments for a post, but fewer are visible, a comment may be waiting to be approved or got marked as spam. Have to fix that counter sometime.
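The countermeasures listed above, and the reason the comment counter can disagree with what is displayed, can be sketched roughly as follows. This is not the site's actual code: the `User` and `Comment` classes, their fields, and the function names are all hypothetical, and the captcha step is left out.

```python
from dataclasses import dataclass


@dataclass
class User:
    logged_in: bool
    email_verified: bool
    is_new: bool  # hypothetical flag: user has no approved comments yet


@dataclass
class Comment:
    author: User
    approved: bool = False
    spam: bool = False


def may_submit(user):
    """Only logged-in users with a verified email address may comment."""
    return user.logged_in and user.email_verified


def is_visible(comment):
    """Hide spam, and hide new users' comments until they are approved."""
    if comment.spam:
        return False
    if comment.author.is_new and not comment.approved:
        return False
    return True


def visible_count(comments):
    """Count only the comments a reader would actually see."""
    return sum(1 for c in comments if is_visible(c))


regular = User(logged_in=True, email_verified=True, is_new=False)
newcomer = User(logged_in=True, email_verified=True, is_new=True)
comments = [Comment(regular), Comment(newcomer), Comment(newcomer, spam=True)]

print(f"{len(comments)} comments, {visible_count(comments)} visible")  # 3 comments, 1 visible
```

In this sketch, a counter based on `len(comments)` would overstate what readers see, while one based on `visible_count` would match the page.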