Websites offer two vectors for spreading email spam: email addresses and insecure forms. Bad guys and their botnets crawl the Web, looking for both. With email addresses they build lists. With forms they hijack servers to send spam.
Let’s take a look at how you can offer better protection in both areas.
Protect your email addresses. In my experience, when email addresses are hidden, the flood of spam subsides within a few weeks. But completely removing your email addresses can inconvenience legitimate users who may want to contact you. With a simple code trick, you can display addresses to legitimate browsers while hiding them from spam bots.
Start by identifying all pages on your Websites that contain email addresses. Ask your Webmaster to encode every email address in JavaScript. Processing JavaScript takes substantially more sophistication than simply scanning HTML text looking for string@string.tld, the pattern of a typical email address. Modern browsers can handle JavaScript; most spam bots cannot.
JavaScript can generate a clickable email address in a Web page. Be sure never to include whole email addresses in your HTML code. Mildly convoluted JavaScript code, such as assigning pieces of your email address to variables, and then concatenating those variables to produce output, will fool virtually all spam bots. Each Website would ideally use different code to make it harder for spammers to spot patterns.
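As a rough illustration of the concatenation idea, here is a minimal sketch in TypeScript, which compiles to the JavaScript a browser runs. The function names, the placeholder address parts, and the `contact-link` element id are all hypothetical, and a real site would want to vary the code so it does not share a pattern with other sites.

```typescript
// Hypothetical helper: assemble an address from fragments so the complete
// address never appears as a single literal in the page source.
function assembleAddress(user: string, domain: string, tld: string): string {
  const at = String.fromCharCode(64); // "@" built from its character code
  return user + at + domain + "." + tld;
}

// Build the markup for a clickable mailto link from the assembled address.
function mailtoLink(user: string, domain: string, tld: string): string {
  const address = assembleAddress(user, domain, tld);
  return '<a href="mailto:' + address + '">' + address + "</a>";
}

// Placeholder address parts, for illustration only.
const linkHtml = mailtoLink("info", "example", "com");

// In a page, the generated markup would be written into a placeholder
// element (the id here is an assumption), for example:
//   document.getElementById("contact-link")!.innerHTML = linkHtml;
```

Because the address only exists after the script runs, a bot that merely scans the raw HTML for the string@string.tld pattern should find nothing to harvest.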
Secure your forms. Web forms, such as those commonly found on Contact Us pages, are frequently abused by spammers who want to use your server to send mail to victims. Mail header injection, for example against a PHP script that passes unchecked form input into the headers of an outgoing message, can trick your server into sending spam messages to third parties with a return address pointing back to you.
So if you ever receive Web form submissions full of random nonsense, that’s a sign of trouble. Your server could be contributing to the worldwide flow of spam.
For securing Web forms, I like two strategies that do not use obtrusive Captchas. One method is to scan every form input field for injection attack strings like “bcc:” and “mime-version:”. Bona fide users are highly unlikely to type things like that into a contact form. The second method employs honeypots.
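The field-scanning method might look something like the sketch below. It is written in TypeScript as a server-side helper under some assumptions: the function and constant names are hypothetical, the form arrives as a simple map of field names to values, and "content-type:" is an assumed extra entry beyond the two strings mentioned above.

```typescript
// Strings that suggest a mail header injection attempt. "bcc:" and
// "mime-version:" come from the article; "content-type:" is an assumed addition.
const SUSPICIOUS_STRINGS: string[] = ["bcc:", "mime-version:", "content-type:"];

// Return the names of all fields whose value contains a suspicious string.
// The comparison is case-insensitive, since mail header names are too.
function findInjectionFields(fields: Record<string, string>): string[] {
  return Object.keys(fields).filter((fieldName) => {
    const lowered = fields[fieldName].toLowerCase();
    return SUSPICIOUS_STRINGS.some((s) => lowered.indexOf(s) !== -1);
  });
}

// Example: the second submission tries to smuggle in a Bcc header.
const cleanResult = findInjectionFields({ name: "Ann", message: "Hello there" });
const attackResult = findInjectionFields({
  name: "x",
  message: "hi\nBcc: victim@example.com",
});
```

An empty result means no field looked suspicious; any non-empty result is a reason to reject the submission rather than send mail.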
Honeypots can identify spam bots by looking for actions that a human would be smart enough to avoid. In my experience, honeypots are surprisingly effective. To create a honeypot, just add an extra field to your form, and set the CSS property “display:none” on its parent element. You may also want to label the field with a warning like “Spam trap, keep blank” in case a user has a browser that does not support CSS.
When the form is submitted, check whether the honeypot field is blank. Spam bots tend to fill out every field on the form, and they generally don't spend the computing power needed to interpret CSS definitions or natural language. When the honeypot field is not blank, you're probably dealing with a spammer.
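A minimal sketch of the server-side honeypot check, again in TypeScript. The field name `website` and the function name are hypothetical, and the markup in the comment is only one assumed way of laying out the hidden field.

```typescript
// Assumed form markup for the trap (field name is hypothetical):
//   <div style="display:none">
//     <label>Spam trap, keep blank <input name="website"></label>
//   </div>
const HONEYPOT_FIELD = "website";

// Returns true when the honeypot field was filled in, which suggests a bot.
// A missing or whitespace-only value is treated as blank.
function honeypotTripped(fields: Record<string, string | undefined>): boolean {
  const value = fields[HONEYPOT_FIELD];
  return value !== undefined && value.trim() !== "";
}

// Example: a human leaves the hidden field blank, a bot fills it in.
const humanTripped = honeypotTripped({ message: "Hi", website: "" });
const botTripped = honeypotTripped({ message: "Hi", website: "http://spam.example" });
```

When the check returns true, the handler can quietly drop the submission instead of sending mail.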
For best results, I combine field scanning with a honeypot. My code toolbox is available here.
A word of caution: Simple security is often good enough for low-value targets. My suggested methods could be evaded by a determined malefactor, so they cannot be relied upon for high-value forms like those used to process financial transactions. That said, I have used my scripts on numerous Website contact forms over the years with virtually no user complaints and no spam.
If enough Websites made basic attempts to protect their email addresses and forms, spammers would either lose revenue or have to start processing JavaScript and CSS. Due to the low response rate on spam, spammers need to process huge amounts of data to make money. Technologies like CSS and JavaScript that add extra computation to each transaction can make spamming significantly more expensive, and thus less profitable.
I doubt the methods outlined here will ever win the battle against spammers, but they could keep your inbox cleaner and save your users from the hassle of obtrusive Captchas.
— Jonathan Hochman, founder, Hochman Consultants
This blog is part of Internet Evolution's IT Clan, which addresses the continuing impact of the Internet on enterprise networks, applications, and management.