Bayesian Kryptonite - spoofed email

I use POPFile Bayesian filtering to keep email spam at bay. With a little training, this works amazingly well: I'm at 99.8% accuracy, and that's after a little over a month of training precipitated by a recent server migration. But Bayesian filtering has one big weakness that I'm seeing more and more: spoofed emails.

You know what I mean: emails titled Your Account Has Been Violated, ostensibly from service@paypal.com. The body is a direct cut and paste from a real PayPal email:

Security Center Advisory!

We recently noticed one or more attempts to log in to your PayPal account from a foreign IP address and we have reasons to belive that your account was hacked by a third party without your authorization. If you recently accessed your account while traveling, the unusual log in attempts may have been initiated by you.

If you are the rightful holder of the account you must click the link below and then complete all steps from the following page as we try to verify your identity.

Of course, the spoofer is desperately hoping you won't notice that the crazy URLs in their email ..

http://paypaldemo.com.previewyoursite.com/source/service/ema/helpextsourcepage/PaypalISAPIruhttp3A2F2Fmyebamcom3A802Fws2FeBayISAPIdll3FMyeBay26ssPageName3DH253AH253A/
http://ebay.doubleclick.net/clk;13012399;10693575;h?http://cardsavetransfer.com/cmdr_login/index.htm
http://ebay.doubleclick.net/clk;13012399;10693575;h?http://paypalcardstraznact.com/cmdr_login/index.htm

.. aren't actually pointing to paypal.com (or ebay.com), and you'll key in your account and password on their servers.
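Checking where a link really points is mechanical, which is what makes these so frustrating. Here's a minimal sketch in Python, using only the standard library, that tests whether a URL's host is actually paypal.com or one of its subdomains. The helper name host_belongs_to and the "legitimate" sign-in URL are my own placeholders for illustration; the other two URLs are from the spoof above.

```python
from urllib.parse import urlsplit

def host_belongs_to(url, trusted_domain):
    """Return True if the URL's host is trusted_domain or a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower()
    trusted = trusted_domain.lower()
    # Match on the END of the host name, so "paypal.com.evil.example" fails.
    return host == trusted or host.endswith("." + trusted)

urls = [
    # Two URLs taken from the spoof email
    "http://paypaldemo.com.previewyoursite.com/source/service/ema/helpextsourcepage/PaypalISAPIruhttp3A2F2Fmyebamcom3A802Fws2FeBayISAPIdll3FMyeBay26ssPageName3DH253AH253A/",
    "http://ebay.doubleclick.net/clk;13012399;10693575;h?http://paypalcardstraznact.com/cmdr_login/index.htm",
    # Hypothetical legitimate link, for comparison only
    "https://www.paypal.com/signin",
]

for url in urls:
    verdict = "ok" if host_belongs_to(url, "paypal.com") else "NOT paypal.com"
    print(verdict, "<-", urlsplit(url).hostname)
```

Note that merely containing the word "paypal" somewhere in the URL counts for nothing; only the tail end of the host name matters.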

These spoof emails contain so-called "kryptonite" because they so closely mimic actual emails from PayPal with valid words and phrases. Bayesian filtering is useless against this type of spam; if the spammer knows what any email in your actual inbox looks like, he can construct one that will beat any Bayesian filter. This is a strict requirement at the very heart of Bayesian filtering itself: the spammer must have no knowledge of valid contents (e.g., things that "get through").
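To see why, here's a toy word-counting classifier in Python. It's a minimal sketch of the naive Bayes idea with equal priors and add-one smoothing, not POPFile's actual implementation, and the training messages are made up for illustration. A message built entirely from words the filter learned from good mail scores as good mail, no matter who sent it.

```python
import math
from collections import Counter

def train(messages):
    """Count word occurrences across a list of message strings."""
    counts = Counter()
    for text in messages:
        counts.update(text.lower().split())
    return counts

def spam_log_odds(text, spam_counts, ham_counts):
    """Sum per-word log-likelihood ratios (add-one smoothing). > 0 leans spam."""
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values()) + len(vocab)
    ham_total = sum(ham_counts.values()) + len(vocab)
    score = 0.0
    for word in text.lower().split():
        p_spam = (spam_counts[word] + 1) / spam_total
        p_ham = (ham_counts[word] + 1) / ham_total
        score += math.log(p_spam / p_ham)
    return score

# Made-up training data: what the filter has seen so far.
spam_counts = train([
    "cheap pills buy now",
    "win money now click here",
    "buy cheap watches now",
])
ham_counts = train([
    "your paypal payment has been received",
    "you sent a payment from your paypal account",
    "your account statement is ready",
])

junk = "buy cheap pills now"           # ordinary spam vocabulary
spoof = "your paypal account payment"  # assembled from "good" words only

print("junk leans spam: ", spam_log_odds(junk, spam_counts, ham_counts) > 0)
print("spoof leans spam:", spam_log_odds(spoof, spam_counts, ham_counts) > 0)
```

The filter only ever sees words. Every word in the spoof has been seen in legitimate mail and never in spam, so each one pushes the score toward "good", and nothing in the model can push back.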

I usually just delete these emails from my inbox; what else can I do? One thing is for sure: popular web-based services can no longer communicate via email with their customers. That's like giving spoofers a free pass; once they have the "template" email they can copy and paste it into a spoof email that is almost guaranteed to get past bayesian filtering for users of that service.

eBay, for example, has almost given up altogether on email communication. You have to visit eBay.com and check your web-based "message center" to communicate with them. I can't say I blame them; what other choice do they have?
