Coding Horror

programming and human factors

CAPTCHA is Dead, Long Live CAPTCHA!

In November 2007 I called these three CAPTCHA implementations "unbreakable":

Google
(unbreakable)
captcha-decoder-7.png
Hotmail
(unbreakable)
captcha-decoder-8.png
Yahoo
(unbreakable)
captcha-decoder-9.png

2008 is shaping up to be a very bad year indeed for CAPTCHAs:

Which means I am now 0 for 3. Understand that I am no fan of CAPTCHA. I view them as a necessary and important evil, one of precious few things separating average internet users from a torrential deluge of email, comment, and forum spam.

So reading that the three best CAPTCHA implementations have been defeated sort of breaks my heart. Even what I consider to be the strongest, Google's implementation, fell hard:

On average, only 1 in every 5 CAPTCHA breaking requests are successfully including both algorithms used by the bot, approximating a success rate of 20%.

A twenty percent success rate doesn't sound like much, but these spammers are harnessing networks of compromised PCs to send out thousands upon thousands of simultaenous sign-up requests to GMail, Hotmail, and Yahoo Mail from computers all over the world. Even a five percent success rate against a particular email service CAPTCHA would be cause for serious concern; with twenty percent success rate you might as well put a fork in that thing-- it's done.

In the meantime, CAPTCHA still serves a useful purpose-- speed bumps that prevent evil bots and the nefarious people who run them from completely overrunning the internet, as Gunter Ollman notes:

CAPTCHAs were a good idea, but frankly, in today's profit-motivated attack environment they have largely become irrelevant as a protection technology. Yes, the CAPTCHAs can be made stronger, but they are already too advanced for a large percentage of Internet users. Personally, I don't think it's really worth strengthening the algorithms used to create more complex CAPTCHAs – instead, just deploy them as a small "speed-bump" to stop the script-kiddies and their unsophisticated automated attack tools. CAPTCHAs aren't the right tool for stopping today's commercially minded attackers.

There's simply too much money to be made in email spam for the commercial CAPTCHA algorithms, regardless of how good they may be, to survive forever. How old is Google's CAPTCHA now? Two to three years old? In the short term, perhaps proliferation and evolution of many different CAPTCHA techniques is the most effective prevention. You should emulate the techniques from the most effective and human-readable industrial grade commercial CAPTCHA, but avoid copying them outright. Otherwise, when they're inevitably broken, you're broken too. CAPTCHA defeating tools are tailored to very specific inputs; if there's little to no monetary incentive, odds are nobody will bother to customize one for yours. My ridiculously simple "orange" comment form protection is ample evidence of that.

Beyond diversification, the deeper question remains: how do we tell automated bots from people-- without alienating our users in the process? How can we build a next generation CAPTCHA that's less vulnerable to attack?

Here's some food for thought:

At some point, unfortunately, CAPTCHA devolves from a simple human reading test into an intelligence test or an acuity test. Depending on how invasive you want to be, you'll eventually be forced to move to two-factor authentication, like sending a text message to someone's cell phone with a temporary key.

I don't have the all answers, but one thing is for sure: I hate spammers. As fellow spam-hating internet users we all have a vested interest in seeing CAPTCHA techniques evolve to defeat spammers.

Written by Jeff Atwood

Indoor enthusiast. Co-founder of Stack Exchange and Discourse. Disclaimer: I have no idea what I'm talking about. Find me here: http://twitter.com/codinghorror