Email Harvesting and Privacy

Deciding whether and how to post email addresses on websites is an old problem. In internet time, that means several years. We do know a little more now than we used to.

To review, the primary risk is that email addresses posted as plain text or HTML can be harvested by highly efficient internet bots. Acting like web crawlers, these bots detect and record email addresses found on websites. They can harvest thousands of email addresses per minute.

Once someone has a list of email addresses, it can be misused in several ways. The addresses can be targets for spam or for phishing messages. In addition, since email addresses are often used as usernames, they can feed a “brute force” password-guessing attack aimed at acquiring the user’s credentials.

Several alternatives exist so that visitors can get a message to a staff member. One that was once very popular is to word the email address on the website slightly differently, replacing the @ sign with “at.” In this way, you use “tim at cgnet.com” instead of tim@cgnet.com. The problem is that bots have gotten smarter and can now harvest these types of email addresses, too.
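To see why the “at” trick no longer helps much, here is a minimal sketch of how a harvester might look for both forms. The patterns and the function name harvest are my own illustration, not taken from any real bot, and they are deliberately simple.

```python
import re

# Plain addresses such as tim@cgnet.com
PLAIN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Lightly disguised addresses such as "tim at cgnet.com" or "tim [at] cgnet.com"
OBFUSCATED = re.compile(
    r"([A-Za-z0-9._%+-]+)\s+(?:at|\[at\]|\(at\))\s+"
    r"([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,})",
    re.IGNORECASE,
)


def harvest(text):
    """Return the set of addresses found in text, in normalized user@domain form."""
    found = set(PLAIN.findall(text))
    for match in OBFUSCATED.finditer(text):
        # Rebuild a usable address from the disguised form.
        found.add(f"{match.group(1)}@{match.group(2)}")
    return found


if __name__ == "__main__":
    print(harvest("Write to tim at cgnet.com for details."))
```

A pattern this loose will also pick up phrases like “look at example.com,” but a harvester does not mind a few false positives when it is collecting addresses by the thousand.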

Another approach is to create a transparent GIF graphic for each email address, inserting the graphic instead of the text. The address-harvesting bots will usually not recognize that the GIF contains a picture of text. I say “usually” because some bots using text recognition have been reported. Apparently these are not widely used, so the risk of harvesting is said to be low.

For users’ convenience, you’d like to associate a mailto: link with the graphic, so that clicking on it would pull up a message window in the visitor’s email application. This defeats the purpose if the mailto: address appears in your website’s code, where a bot can read it. If the link instead refers to addresses held on another server, so that no email addresses are on the site itself, it’s OK.

The downside of this approach is that if your site is being specifically targeted by an adversary, the bad actor can simply copy the list of your staff’s email addresses by hand. This is much less likely than being scanned by a harvesting bot, however, unless your site’s activities make it a special target for activist hackers.

Another solution is to use a form on a web page, either to contact a staff member, or to send a general message to the organization. This eliminates the use of email addresses altogether and can also allow you to collect other information about the visitor.

Contact forms are very popular. More than 1,000 contact form plug-ins exist for WordPress alone, I’ve read.

There is a downside here, too, because forms must be well crafted to protect against several attacks, such as cross-site scripting. It is possible to create a secure form, but it involves knowledge and work on the part of your web developer. Several versions of WordPress contact plug-ins have been found to be insecure.
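As a rough illustration of the kind of care involved, the sketch below shows two basic precautions a form handler might take: escaping visitor input before echoing it back in a page (the usual defense against cross-site scripting), and rejecting line breaks in values that will end up in email headers. The function names are hypothetical, the header check addresses an attack I am adding as an example, and a real form needs more than this.

```python
import html


def render_confirmation(name, message):
    """Build a confirmation snippet with visitor input escaped for HTML."""
    # html.escape turns <, >, &, and quotes into entities, so any
    # markup the visitor typed is displayed as text instead of being run.
    safe_name = html.escape(name)
    safe_message = html.escape(message)
    return (
        f"<p>Thanks, {safe_name}. We received:</p>"
        f"<blockquote>{safe_message}</blockquote>"
    )


def is_safe_header_value(value):
    """Reject values containing line breaks before using them in a mail header.

    A subject or reply-to field with an embedded newline could be used to
    smuggle extra headers into the outgoing message.
    """
    return "\r" not in value and "\n" not in value


if __name__ == "__main__":
    print(render_confirmation("Tim", "<script>alert(1)</script>"))
    print(is_safe_header_value("Hello\r\nBcc: everyone@example.org"))
```

A well-maintained plug-in or web framework normally does this escaping for you, which is one reason to prefer one over a hand-built form.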

Adding a CAPTCHA to an email address list is helpful. In this scenario, the user must go through a CAPTCHA routine, such as identifying the right pictures in a group, before the mailto: link is activated. CAPTCHA, or Google’s reCAPTCHA, can also be used to keep bots from entering spam into the contact form. This option does make the user work a bit harder, but it works.
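On the server side, the CAPTCHA check comes down to asking the CAPTCHA service whether the visitor’s token is valid before revealing an address or accepting a form. Here is a minimal sketch assuming reCAPTCHA’s siteverify endpoint, which takes the fields secret and response and returns JSON with a success flag. The function names and the timeout are my own choices, and the secret key would come from your own reCAPTCHA account.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret_key, client_token):
    """Prepare the POST request that asks the service to check a token."""
    body = urllib.parse.urlencode(
        {"secret": secret_key, "response": client_token}
    ).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=body, method="POST")


def captcha_passed(response_body):
    """Return True only if the service's JSON reply reports success."""
    try:
        result = json.loads(response_body)
    except ValueError:
        # Anything that is not valid JSON is treated as a failed check.
        return False
    return isinstance(result, dict) and result.get("success") is True


def verify(secret_key, client_token, timeout=5):
    """Send the token to the service and report whether it passed."""
    request = build_verify_request(secret_key, client_token)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return captcha_passed(reply.read().decode("utf-8"))
```

Only when verify returns True would the page hand over the mailto: link or pass the form contents along. Treating every unclear answer as a failure keeps a broken reply from opening the gate.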

But wait, there’s more! Since the enactment of the European Union’s General Data Protection Regulation (GDPR) and similar laws elsewhere, such as California’s AB 375 (the California Consumer Privacy Act of 2018), collecting personal data, such as information about the visitor, raises new considerations. Many organizations may not be affected by these laws, since the GDPR applies only to information about people in the European Union, and the California law generally does not apply to nonprofits.

In general, however, people are becoming more sensitive about privacy issues, and it could bring some credit to the Foundation to adopt a careful privacy policy and procedures, if it decides to use forms.

If the Foundation is subject to the GDPR, or if it wants to offer privacy protection as if it were, there are several necessary procedures, such as making it possible for a person’s information to be removed upon request, posting the policy and procedures, and ensuring that permission is explicitly granted for the information to be collected.

Probably the best alternative today is to use a contact form plug-in with good security features and ratings, provided your system supports such plug-ins and you don’t have privacy compliance issues. Otherwise, a CAPTCHA-protected list may make the most sense.
