Maximizing Ham – Focal Curve

I have seven email addresses: two at this domain (one personal, one administrative), one at Yahoo, one at Hotmail, one at Gmail, one spamtrap at facehugger, and one at work. All of these addresses combined get an average of about 5 spams per day. Yes, I said five. Even the spamflow into my facehugger trap has tapered off dramatically in the past few weeks, down to one or two per week. I’m not completely spam-free, but I’m probably as close as I’ll ever get.

Alas, I don’t get much real email either, but should this blog ever take off and reach a broader audience, I can expect my mail volumes to spike dramatically. It’s a price of fame I am willing to pay. I would love to get more email from interested folks who read my site. Even hate mail would be amusing fodder for public ridicule. But I don’t want spam. I hate spam, and will not tolerate it. Even the remarkably few spams I get fill me with rage and disgust, so I have no intention of exposing myself to an exponential tide of the stuff.

spam over ham

So how can one keep the channels open to emails from real humans while camouflaging oneself to evade spammers? Mike over at FiftyFourEleven posted yesterday on the topic of defensive communication, with a few ideas for preserving sanity in the face of high mail traffic. Mike’s emphasis is on reassuring your human senders that you are indeed available, but I think the spam evasion issue deserves some deeper exploration. The challenge is to make yourself reachable and available for the ham you want while at the same time protecting yourself from the deluge of spam you don’t want.

The ubiquitous contact form

Every site has one, including mine. It’s an easy and convenient way for visitors to contact you, and it can also be a nice way of hiding your email address from spambots. By far the most widespread webform-to-email processor is Matt’s FormMail script, a little bundle of Perl that lives in your cgi-bin and passes the submitted form data to your SMTP server. However, the original FormMail script and its variants embed the receiving address in an <input type="hidden">, meaning your sensitive email address is plainly visible in your HTML source. A browser won’t display it, but a spambot will find it in a heartbeat.

Alternatively, the script I use is called Master Feedback, which embeds the recipient address in the script itself rather than in the HTML. On a properly configured server, scripts in cgi-bin are executed rather than served as source, so there’s no way for anyone to crawl through the Perl and find your address. Works like a charm. Whatever method you use to process your forms, just be sure your email addy is well hidden.
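To make the idea concrete, here is a minimal sketch in Python (not the Perl of Master Feedback, whose internals I'm not reproducing) of a handler that keeps the recipient in server-side code. The address, the subject tag, and the form field names are all made up for illustration.

```python
from email.message import EmailMessage

# Hypothetical values for illustration; none of these come from Master Feedback.
RECIPIENT = "owner@example.com"   # lives only in server-side code, never in the HTML
SUBJECT_TAG = "[contact-form]"    # lets mail rules recognize form mail later


def one_line(value: str) -> str:
    """Collapse all whitespace, including CR/LF, so form input cannot inject headers."""
    return " ".join(value.split())


def build_contact_message(form: dict) -> EmailMessage:
    """Turn submitted form fields into a message whose recipient is fixed server-side."""
    subject = one_line(form.get("subject", "")) or "(no subject)"
    claimed = one_line(form.get("email", "")) or "(not given)"

    msg = EmailMessage()
    msg["To"] = RECIPIENT      # any "recipient" field in the form is simply ignored
    msg["From"] = RECIPIENT    # the mail effectively comes from yourself
    msg["Subject"] = f"{SUBJECT_TAG} {subject}"
    # The visitor's claimed address goes in the body, since it cannot be trusted.
    msg.set_content(f"Claimed sender: {claimed}\n\n{form.get('message', '')}")
    return msg
```

Handing the result to a mail server would be something like `smtplib.SMTP("localhost").send_message(msg)`, assuming a local SMTP server is listening. The point is only that nothing in the page source tells a spambot where the mail ends up.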

An additional bonus of using contact forms is that the mail you receive is essentially coming from yourself, or at least from a known domain that you can whitelist. It’s also possible to embed a subject line or a string of text in the message body which can be passed through client-side rules to help you sort your mail (or auto-reply as Mike suggested). The risk of losing a worthwhile message to a false positive is practically eliminated with a contact form and some intelligent filtering.
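A mail rule of that sort might look roughly like this in Python. The sender address, the subject tag, and the folder names are assumptions of mine; a real mail client would express the same test in its own rules dialog.

```python
from email.message import EmailMessage
from email.utils import parseaddr

# Hypothetical values: the address your form script sends from, and its subject tag.
FORM_SENDER = "owner@example.com"
FORM_TAG = "[contact-form]"


def folder_for(msg: EmailMessage) -> str:
    """File whitelisted contact-form mail separately; everything else goes to the inbox."""
    sender = parseaddr(str(msg.get("From", "")))[1].lower()
    subject = str(msg.get("Subject", ""))
    if sender == FORM_SENDER and subject.startswith(FORM_TAG):
        return "Contact"
    return "Inbox"
```

Because both conditions are under your control, a rule like this can safely bypass whatever spam filtering the rest of your mail goes through.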

The major downside of contact forms is that they’re essentially anonymous: you don’t truly know where the mail is coming from. You can ask the submitter to enter an email address, but there’s a good chance it will be a fake one. Even worse is if the submitter enters someone else’s address when they send their hate mail. Requiring confirmation of a valid address before sending a simple “Hi there” is just lame. Besides, the simple fact that you’re soliciting any information from your visitors can raise privacy concerns. On a personal site or blog that’s not so risky, but a corporate site that takes any information from the user must have a clear privacy policy available.

The human-readable address

One more drawback of contact forms is that you’re imposing your rules upon the user, limiting their choices for how to reach you. If they really want to use their own mail client, you still need to present a working email address to the public. Spammers employ crawling software to scan web pages looking for the magical @ sign. They then cull the word to the left as a username, and whatever dot-separated words to the right as the domain and extension to form a single email address. They rely on the standard “user@domain.tld” formatting of an email address, so they can usually be fooled by altering that format.
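The harvesting step described above can be sketched in a few lines of Python. The pattern is a deliberately naive guess at what a crawler might use, not code taken from any real spambot.

```python
import re

# Naive "user@domain.tld" pattern: a username, an @ sign, then dot-separated words.
ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def harvest(page: str) -> list[str]:
    """Return every substring of the page that looks like a standard email address."""
    return ADDRESS.findall(page)


# A plain mailto: link is picked up immediately.
found = harvest('<a href="mailto:smackfu@example.com">mail me</a>')

# Break the standard format and the same pattern comes up empty.
missed = harvest("smackfu -at- example [dot] com")
```

A smarter harvester could of course add patterns for common disguises like "-at-" and "[dot]", which is why the less predictable alterations hold up better.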

The classic mailto: link is an endangered animal in these spam-conscious times, so one alternative is to munge the address, obscuring it in some manner that a live person can decipher but that prevents automated page crawlers from recognizing it. Some examples:

smackfu -at- example [dot] com
rivitz@ @ @example….com–
brassdog@ e x amp l e dotcom
fuzzmuttTROUTexample.com minus fish plus @

The possibilities are endless. Of course, the more obscured your address is, the more work you’re punting off onto your visitor to parse and rearrange it into a real address. (As an aside, you should be aware that just inserting “nospam” into an address is no longer an effective deterrent to spambots. I got spam sent to one disposable address with “no” appended to the username; the spambot had simply omitted the word “spam”.) Munging is a nice quickie solution, but the better choice is encoding.

An email address can be encoded on a web page using HTML entities and/or numeric character references (for example, &#64; for the @ sign) so the browser will display it properly, but an unintelligent spider program can’t make sense of it. You can even encode the markup of a mailto: to produce a functional link. The crafty minds at Hivelogic have helpfully automated this in their handy dandy Enkoder utility, outputting a wad of nonsensical gibberish that a browser will convert into a working email link. Neat.
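For the curious, the plain entity approach is easy to sketch in Python. This is not a reproduction of what Enkoder outputs; it only shows the simpler idea of writing every character as a numeric character reference, and the function names are my own.

```python
import html


def encode_entities(text: str) -> str:
    """Replace every character with its decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def obfuscated_mailto(address: str) -> str:
    """Build a mailto: link whose href and link text are both entity-encoded."""
    href = encode_entities("mailto:" + address)
    return f'<a href="{href}">{encode_entities(address)}</a>'


link = obfuscated_mailto("smackfu@example.com")

# html.unescape does what a browser does with the references, so the round trip
# shows that a visitor still ends up with a normal, working link.
decoded = html.unescape(link)
```

The encoded markup contains no literal @ sign and no readable domain, so a crawler that only scans the raw source for the standard format finds nothing. Any spider that bothers to decode entities first will see straight through it.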

Aggressive filtering

Another viable (though risky) option is to simply put your address out there for the world to see and just deal with whatever comes. Most email addresses are spammed within hours of being published on a website, and you can expect even more spam to be shoved your way the more popular your site is. If Google can find you, so can spammers. If you opt for making your address public, prepare to suffocate under hundreds of daily advertisements for discount V1ag’r4. You’ll surely want some smart filtering to help you separate the wheat from the chaff, but filtering isn’t all it’s cracked up to be.

The broader and stricter your spam filters are, the greater the risk of false positives. Bayesian filtering software will learn from your spam and adapt its rules intelligently, reducing that risk, but false positives still happen. Blocklists are effective yet somewhat draconian, and there’s always the chance that innocent bystanders will be blocked just because they happen to share an IP range with a spammer. Going pure whitelist (meaning you accept email only from approved senders) stops all spam but is just not practical and goes directly against that whole open channel thing. Challenge-response systems (where you demand proof of a sender’s humanity before accepting their email) are similarly disruptive, and just generate even more email on top of the spam, usually sent to the wrong person since spammers forge the From: header.
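To show what "learning from your spam" means, here is a toy naive Bayes classifier in Python. It is a bare-bones sketch with add-one smoothing and made-up training messages. Real Bayesian filters are considerably more refined, and the class and method names here are mine.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy naive Bayes spam scorer: word counts per class plus add-one smoothing."""

    def __init__(self) -> None:
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text: str) -> list[str]:
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text: str, label: str) -> None:
        """Learn from one message the user has marked as 'spam' or 'ham'."""
        self.words[label].update(self.tokens(text))
        self.docs[label] += 1

    def spam_probability(self, text: str) -> float:
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total_docs = self.docs["spam"] + self.docs["ham"]
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed prior: how common this class has been so far.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            total_words = sum(self.words[label].values())
            for tok in self.tokens(text):
                # Smoothed likelihood, so unseen words never zero out the score.
                count = self.words[label][tok]
                score += math.log((count + 1) / (total_words + vocab + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


bayes = TinyBayes()
bayes.train("cheap discount pills buy now", "spam")
bayes.train("discount pills online", "spam")
bayes.train("lunch on friday", "ham")
bayes.train("loved your blog post", "ham")
```

A message is treated as spam when its score passes some threshold. Push the threshold up and more spam slips through; pull it down and a friend who happens to mention discount pills lands in the junk pile, which is exactly the false positive problem.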

And of course, the real problem with filtering is that it doesn’t prevent the spam from being sent. Client-side filtering takes much of the drudgery out of sorting mail, but still forces you to download it. Server-side filters like SpamAssassin will scan and tag spam before you see it, and you can even configure your mail server to deliver positives to another mailbox to cut down on what you have to download. It’s unwise to automatically reject or delete mail flagged as spam, just in case there’s one of those false positives in the junk pile.
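The "deliver it elsewhere instead of deleting it" approach boils down to a check like the following Python sketch. It assumes the server-side filter marks positives with an `X-Spam-Flag: YES` header, which is how a default SpamAssassin setup tags them as far as I know, and the mailbox names are placeholders.

```python
from email import message_from_string


def mailbox_for(raw_message: str) -> str:
    """Route tagged spam to a junk mailbox instead of rejecting or deleting it."""
    msg = message_from_string(raw_message)
    # Assumption: the filter adds "X-Spam-Flag: YES" to anything it judges spam.
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    return "Junk" if flag == "YES" else "INBOX"


tagged = "X-Spam-Flag: YES\nSubject: discount pills\n\nbuy now"
clean = "Subject: lunch on friday?\n\nsee you there"
```

Nothing is thrown away here. The junk mailbox can be skimmed now and then for false positives and emptied on whatever schedule suits you.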

In the end, the only way to keep your inbox free of spam is to make sure spammers never find out you have one. Keep your email address as private as you reasonably can, obscure it when you publish it, and deal with the spam when it (inevitably) comes. The beauty of email is the ease with which we can communicate with each other, cheaply and instantly. It’s unfortunate that unscrupulous, unethical, inconsiderate marketroid shitbags had to ruin it for all of us.

1 comment

Jay says:

8/24/2004 at 11:53 am

Great article and your timing is great. I was just contemplating how I should go about adding contact info to my website. Lots of food for thought. I guess it’s pretty clear simply putting a mailto: link isn’t the best choice in the long run. Filters just become cumbersome. Thanks for the info.
