This article from Wired exposes the amusing phenomenon of geeks comparing spam scores. SpamAssassin is one of the more popular spam filters; it scans the content of emails and tallies a score based on the detected spamminess of each message. Major red flags such as the phrase “penis enlargement” or the word “offer” in a URL earn high points, and mail clients and servers can be configured to discard messages with especially high total spam scores. Given the tendency for geeks to relish accomplishments that don’t interest normal people at all, it was only natural that people would start comparing their spammiest spams.
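For the curious, the tallying idea can be sketched in a few lines of Python. This is only a toy illustration of rule-based scoring, not SpamAssassin's actual rule set: the rule names, point values, and the threshold below are all made up for the example.

```python
import re

# Hypothetical rules: (name, pattern, points). These names and point
# values are invented for illustration and are not SpamAssassin's own.
RULES = [
    ("ENLARGEMENT_PHRASE", re.compile(r"penis enlargement", re.I), 3.0),
    ("OFFER_IN_URL", re.compile(r"https?://\S*offer", re.I), 2.0),
]

# Hypothetical cutoff above which a message would be discarded.
THRESHOLD = 5.0


def score_message(text):
    """Return (total score, list of (rule name, points)) for one message."""
    hits = [(name, points) for name, pattern, points in RULES
            if pattern.search(text)]
    total = sum(points for _, points in hits)
    return total, hits


spam_total, spam_hits = score_message(
    "Amazing penis enlargement! Visit http://example.invalid/special-offer"
)
ham_total, ham_hits = score_message("Lunch on Thursday?")
```

Each rule that matches adds its points to the total, and a mail client or server would compare that total against whatever cutoff it has been configured with.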
Focal Curve’s mail server runs SpamAssassin, but luckily my inbox is relatively spam-free (knock on wood). And actually, just a day or two before I came across this article, I purged my junk mail folder. However, I do own a honeypot address, monty@facehugger.com. Go ahead and harvest it, spambots, see if I care. Facehugger.com is one of several domains offered for free webmail by Dark Horse Comics. It’s especially good for spam sampling because it has very robust user-controlled filters, it’s easy to view the original source of messages including full headers, and they recently installed SpamAssassin.
Monty gets about 5 spams per day, most of which apparently come from the same two or three spammers, since they always follow a similar pattern. That’s not even counting the frequent identical spams hawking the same site or product from different forged addresses and routed through different overseas relays. Most of those spams score in the 40s and 50s on SpamAssassin, and that’s after subtracting the 100 points automatically assigned to spams from blacklisted senders.
A spam score higher than 20 is pretty remarkable, and most of the spam I get at my “real” email address scores only around 8-10. So why does Monty get so much spam in the 40-60 range? How stupid and clueless is this persistent and undaunted spammer? Clearly, very. Here is the source of the highest-scored spam currently in storage. It was sent to multiple recipients in the facehugger domain, indicating a likely dictionary attack, but in case some of those other addresses belong to real people I’ve deleted the usernames. There’s no reason for them to get reharvested here.
As a geek and a vehement spam-hater, I can certainly understand what the Wired article is talking about. There’s some odd appeal to spam scoring, some twisted sense of achievement from capturing such a remarkable specimen from the wild. And assigning a score immediately lends itself to competition. Breaking a message down into neatly labeled elements of spam and valuating them based on their spammy indicativeness just opens up a whole new view of how much evil planning goes into invading my privacy. Look how many RBLs this guy is listed in. Look how unabashed he is, shamelessly concocting such a spammy spam apparently with no serious effort to appear like legitimate email. Truly, the stupidity is something to be marveled at.
The above sample, though high scoring, certainly isn’t the most dastardly spam poor Monty has been sucker-punched with. Last week he captured a really nasty one with big sharp pointy teeth. While it only scored 23.42 points, it employed every dirty spammer trick in the book. A made-up name and forged From: address go without saying. But deeper than that, the message itself was base64 encoded, so viewing the source only shows a bunch of “PGh0bWw+PGhlYWQ+PHRpdGxlP” nonsense. Running the message body through a base64 decoder reveals that it was further encoded as quoted-printable, breaking up the HTML with a bunch of annoying “=3D” sequences in a further effort to obfuscate the real message and confuse automated filters.
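Peeling off those two layers is easy to reproduce with Python's standard library. The following is a minimal sketch rather than the exact steps I took: the sample body is built on the spot from a made-up scrap of HTML (the example.invalid URL is a placeholder), and the helper name `peel_layers` is my own invention.

```python
import base64
import quopri


def peel_layers(body: str) -> str:
    """Undo base64 first, then quoted-printable, and return the HTML text."""
    qp_bytes = base64.b64decode(body)           # outer layer: base64
    html_bytes = quopri.decodestring(qp_bytes)  # inner layer: quoted-printable
    return html_bytes.decode("latin-1")


# Build a stand-in for the spam body: quoted-printable inside base64.
# The HTML here is hypothetical, not the actual message.
html = '<html><body><img src="http://example.invalid/a.gif"></body></html>'
qp_layer = quopri.encodestring(html.encode("latin-1"))
sample_body = base64.b64encode(qp_layer).decode("ascii")

decoded = peel_layers(sample_body)
```

The quoted-printable layer writes every literal “=” as “=3D”, which is why an attribute like src= shows up mangled until the second decode, and the base64 layer of any body starting with an html tag begins with the same “PGh0bWw+” gibberish.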
Then, digging into the HTML itself, I saw that the entire message consisted of a single image sourced from a web server, so the actual message content couldn’t be analyzed. The URL of that image source contained the name “monty@,” meaning that the spammer can check the access logs of the server hosting the image file and know which recipients opened his spams, thus validating the victim’s address as a target for more spam. The image was also a link to the spamvertised website, and the link passed Monty’s email address as a parameter for target validation and order tracking.
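Spotting that kind of tracking can be automated once the HTML is decoded. Here is a rough sketch, assuming a made-up message: the URLs, the `id` parameter name, and the `TrackerFinder` and `urls_leaking` names are all hypothetical stand-ins, since I’m not reproducing the real spammer’s links here.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, unquote, urlsplit


class TrackerFinder(HTMLParser):
    """Collect every image source and link target found in an HTML message."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            if (tag == "img" and name == "src") or (tag == "a" and name == "href"):
                self.urls.append(value)


def urls_leaking(html, marker):
    """Return the URLs in the message that contain the recipient marker."""
    finder = TrackerFinder()
    finder.feed(html)
    return [u for u in finder.urls if marker.lower() in unquote(u).lower()]


# Hypothetical message body modeled on the description above.
message = (
    '<a href="http://spam.example.invalid/buy?id=monty@facehugger.com">'
    '<img src="http://img.example.invalid/monty@/offer.gif"></a>'
)
leaks = urls_leaking(message, "monty@")
link_params = parse_qs(urlsplit(leaks[0]).query)
```

Any URL that carries the recipient’s name means that merely loading the image or clicking the link tells the sender exactly whose mailbox is live.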
And on top of all that, the bottom of the message had this small slice of seemingly normal text (quoted from Hitchhiker’s Guide to the Galaxy), an effort to fool Bayesian filters: “‘This must be Thursday,’ said Arthur to himself, sinking low over his beer, ‘I never could get the hang of Thursdays.’” That is probably the spammiest single spam I have ever received. All of that just to hawk some shady debt consolidation scheme I never asked for and have no interest in. I can only shake my head in awe and disbelief.