Leo's MovableType Tips

by Leo A. Notenboom

by Ask Leo!

Dealing with Comment Spam

Comment spam is the bane of the modern weblog. In an effort to boost their own ranking in the search engines, nefarious webmasters create software tools or "bots" that go out, find weblogs, and post scores of comments containing links back to their sites. The search engines then see those links and apply the general rule that "the more people link to you, the more important you must be". The result the spammers are after is better search engine ranking for their websites about poker, Viagra and body part enlargement.

As weblog owners, not only do we not want to help the spammers, but we also just don't want their off-topic posts included in our weblogs.

I've tried several techniques to combat comment spam. The most effective I found prior to switching to MT 3.1 was something called a "captcha" test (completely automated public Turing test to tell computers and humans apart). You've seen them: before you can do something, you need to type in the letters presented in a distorted image. Computers can't read the image (yet), but humans can. I used the captcha plugin developed by James Seng with great success.

MT 3.1 did some major rework on comment entry and user validation in order to enable TypeKey. Now, TypeKey is fine, but to me it seems yet another place people would have to log in, and yet another barrier to legitimate comments. Unfortunately that rework seems to have broken the Captcha test, and I was unable to get it to work.

So, after all that backstory, here's my current approach on Ask Leo!:

The goal of all that is to rename the comment submission script, and make it difficult for the spam-bots to determine what it is. Most spam-bots either assume the name of the comment script and just post directly to it, or they scan a weblog entry for the <FORM> and read the name of the script from the "action=" attribute.

In my case, the default comment script name doesn't exist. The "action=" attribute points to a completely different page ... one that, if invoked, is nothing more than a static HTML page. (Have a look: http://ask-leo.com/commentspam.html )

The magic behind this trick is the JavaScript in comment.js. When a browser loads the <script> line that references it, the script executes and changes the form's "action=" from the static HTML page to the "real" comment script, which I've currently named something.pl.
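To make the idea concrete, here is a minimal sketch of what a script along the lines of comment.js might look like. It is not the actual file from Ask Leo!: the form id "comments_form", the /cgi-bin/mt/ path and the function name are all assumptions made up for illustration.

```javascript
// Hypothetical values: the form id and the script path are assumptions,
// not taken from the real site. Only "something.pl" comes from the article.
var REAL_COMMENT_SCRIPT = "/cgi-bin/mt/something.pl";

// Point a form (or any object with an "action" property) at the real script.
// Returns true if a form was updated, false if there was nothing to update.
function pointFormAtRealScript(form, realAction) {
  if (!form) {
    return false; // page has no comment form, so do nothing
  }
  form.action = realAction;
  return true;
}

// In a browser, swap the action as soon as this script runs.
// The guard lets the same file load outside a browser without errors.
if (typeof document !== "undefined") {
  pointFormAtRealScript(
    document.getElementById("comments_form"),
    REAL_COMMENT_SCRIPT
  );
}
```

For a sketch like this to work, the <script> line has to appear after the <FORM> in the page, so that the form already exists when the script looks it up. Visitors with JavaScript turned off would land on the static page instead, which is a reasonable place to explain what happened.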

For a spam-bot to successfully start comment-spamming me again, it would need to do one of two things: begin parsing JavaScript, or actually step through the form on the entry page rather than attempting to go directly to the comment submission script.
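One way to check your own pages is to look at them the way a simple bot would: read the raw HTML and pull out the form's "action=" without running any script. The sketch below does that against a made-up snippet of entry-page markup. The markup, the form id and the function name are assumptions for illustration only.

```javascript
// Return the action of the first <form> found in raw HTML, or null.
// This mimics a scanner that reads markup but never executes JavaScript.
function naiveFormAction(html) {
  var match = /<form\b[^>]*\baction\s*=\s*["']([^"']*)["']/i.exec(html);
  return match ? match[1] : null;
}

// Hypothetical entry-page markup: the form carries only the decoy action,
// and the real script name lives inside comment.js, not in the page.
var entryPageHtml =
  '<form id="comments_form" method="post" action="/commentspam.html">' +
  '<textarea name="text"></textarea></form>' +
  '<script type="text/javascript" src="/comment.js"></script>';

console.log(naiveFormAction(entryPageHtml)); // prints "/commentspam.html"
```

If a check like this ever turns up the real script name in the raw HTML, the decoy isn't doing its job.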

Now, it's important to note that this technique is not foolproof - quite literally, actually: any fool can enter a spam-ish or abusive comment by hand, and there are definitely fools out there. The tools the spammers use eventually overcome the roadblocks that we put in front of them, especially as specific techniques become more common. I fully expect that this technique will get broken, and I'll end up obfuscating my comment posting approach in some other way.

But for now, not a single automated spam comment has been posted in the several months this technique has been in use.