Pete's Odyssey

    A website and blog by Peter Lewis

Spam

Take that, comment spam!

Ha! I've done something about the comment spam I've been getting lately. Readers of this site might have noticed the occasional comment full of spam links, mainly pertaining to sex and loans, showing up at the bottom of blog posts. I've been doing a pretty decent job of deleting these comments when they arrive, but their volume has been increasing quite a bit lately. Time for action.

There are a couple of different approaches to this kind of thing. The first option is some kind of automatic spam filter, often based on a machine learning process or a look-up table of known spammers. I'm not a fan of this kind of thing. The primary reason is that I can't know when it's getting it wrong. Take email as an example. A couple of spam emails slipping through the net is okay, as I can just delete them, but what about any solicited emails that are falsely categorised as spam? Email is important to me, and I don't want to have to start worrying about that. After all, I wouldn't get someone to throw out my post for me if the envelope didn't look right. So, I don't use a spam filter for my email. To be honest though, I don't get too much spam anyway (a handful of emails per month). I attribute this to being quite vigilant in not letting my email address get out there on the web in plain text format. So far, it's worked very well.

So, I don't want to go down the filtering route with this site either. I also like the fact that anyone can comment without any approval from me, so moderating comments isn't on the cards. I've gone instead for the option of a captcha. Regular internet users will probably be familiar with this kind of idea. It's basically a variant, perhaps, on the idea of a Turing Test. In essence, you, the comment poster, will now be required to prove that you have a brain, and that, to some extent, it works. Oh, and that you have studied some kind of maths, at least to primary school level. The captcha module for Drupal, which I use for this site, defaults to using a simple maths problem, such as 2 + 8. Answer correctly, and your comment will be duly posted, but answer incorrectly, and it will not. Simple, eh?
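For the curious, the general idea can be sketched in a few lines of Python. This is only my own illustration, not how the Drupal module is actually written: the function names (make_challenge, check_answer, submit_comment) and the range of numbers used are assumptions made up for the example.

```python
import random


def make_challenge(rng=random):
    """Build a simple addition question and its expected answer.

    The 1 to 10 range is an assumption for this sketch.
    """
    a = rng.randint(1, 10)
    b = rng.randint(1, 10)
    return f"{a} + {b}", a + b


def check_answer(expected, submitted):
    """Return True only if the submitted text is the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Anything that isn't a whole number counts as a wrong answer.
        return False


def submit_comment(comments, text, expected, submitted):
    """Post the comment only when the challenge was answered correctly."""
    if check_answer(expected, submitted):
        comments.append(text)
        return True
    return False
```

A real site would also need to remember the expected answer on the server between showing the form and receiving the comment, rather than trusting anything sent back by the browser.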

Of course, as I have already alluded to, this also excludes people with very poor maths skills. Another, albeit small, widening of the digital divide? Actually, some sites which use more complex visual captchas have to provide accessible versions too. A graphical one is not much use, for example, if you use a text-to-speech browser.

The idea also assumes that computers generally can't solve the problems being asked. Now, of course, 2 + 8 is perhaps even easier for a computer than for a human, but the key here, I imagine, is that the computer isn't expecting the question. Determined spammers could of course get around this. As the techniques both for designing and for breaking captchas improve, will a problem be found that is provably impossible for a computer to solve? Or is the long-term future of open digital communication doomed to wallow in can upon can of spam, or else rely on automatic filtering?
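To show just how little effort a determined spammer would need once they knew to expect the question, here is a rough Python sketch of a solver for sums written like "2 + 8". The function name solve_sum and the exact question format are my own assumptions for illustration. I haven't looked at what real spam bots do.

```python
import re


def solve_sum(question):
    """Answer a question of the form '<number> + <number>', else return None."""
    match = re.fullmatch(r"\s*(\d+)\s*\+\s*(\d+)\s*=?\s*", question)
    if match is None:
        # Not a sum we recognise, so the bot is stuck.
        return None
    return int(match.group(1)) + int(match.group(2))


print(solve_sum("2 + 8"))  # 10
```

Anything the pattern doesn't recognise comes back as None, which is really the whole defence: the captcha only works for as long as nobody bothers to write the few lines needed for its particular style of question.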