When Programmeroo launched, we initially decided not to use CAPTCHAs on any forms. Let’s face it: CAPTCHAs are still annoying and hurt the user experience. We also wanted to get to know the spammers better, find out where they were coming from, and identify their patterns. Unsurprisingly, the contact form soon started receiving a lot of spam. That’s where the fun began, and the rest is the story of how we went from a few spam messages a day to almost none, using a few simple techniques with no usability impact on the end user.

Smarter Server-side Validation

When it comes to defeating spam, client-side validation doesn’t help much. Most of the time, bots send HTTP POST requests directly to the server, so it pays to focus on server-side validation instead. One of the first things we noticed was a lot of duplicate submissions coming from the contact form within a short window. To tackle this, we added a validation rule that ignores messages that have already been submitted within a short timeframe. Comparing legit and spam messages also showed that a number of spammers tend to send lengthy messages. To address this, we started checking the message length against a defined minimum and maximum.
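The two rules above can be sketched roughly as follows. This is a language-agnostic illustration written in Python, not the code running on Programmeroo: the thresholds, the in-memory dictionary, and the function names `fingerprint` and `is_acceptable` are all assumptions made for the example.

```python
import hashlib
import time

# Hypothetical thresholds; tune them against your own legit vs. spam messages.
MIN_LENGTH = 10
MAX_LENGTH = 2000
DUPLICATE_WINDOW_SECONDS = 600

# Maps a message fingerprint to the time it was last seen.
# In a real application this would live in a cache or a database.
_recent = {}


def fingerprint(message):
    """Hash a normalised copy of the message so trivial variations still match."""
    normalised = " ".join(message.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def is_acceptable(message, now=None):
    """Return True if the message passes the length and duplicate checks."""
    now = time.time() if now is None else now

    # Rule 1: reject messages outside the allowed length range.
    if not MIN_LENGTH <= len(message.strip()) <= MAX_LENGTH:
        return False

    # Rule 2: reject a message already seen within the duplicate window.
    key = fingerprint(message)
    last_seen = _recent.get(key)
    _recent[key] = now
    if last_seen is not None and now - last_seen < DUPLICATE_WINDOW_SECONDS:
        return False

    return True
```

Normalising case and whitespace before hashing is a design choice of this sketch: it catches resubmissions that differ only cosmetically, at the cost of occasionally treating two near-identical legit messages as duplicates.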

Blacklist Mechanism

Another approach was to flag spammers and block their IPs at the application level. For that, we implemented a simple yet powerful open-source PHP library called Guard. Whenever a contact message is marked as spam, the author’s IP is stored in a database. Every new submission is then checked against the blacklist database and ignored if there is a match. To minimise the performance impact on the form, Guard uses a MongoDB driver. In addition to IPs, Guard is flexible enough to manage other entities such as names and email addresses. One drawback of this access control mechanism is false positives: you may end up blocking a legit author. Also, spambots usually use multiple IPs, so you can’t block them completely.
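To make the flow concrete, here is a minimal sketch of a blacklist check. It is not Guard’s actual API: the `Blacklist` class, its methods, and the `should_ignore` helper are hypothetical names, and a plain in-memory set stands in for the database.

```python
class Blacklist:
    """In-memory stand-in for a blacklist store (not Guard's actual API)."""

    def __init__(self):
        self._entries = set()

    def block(self, kind, value):
        """Record an entity such as ("ip", "203.0.113.7") as blocked."""
        self._entries.add((kind, value.strip().lower()))

    def is_blocked(self, kind, value):
        """Return True if this entity was blocked earlier."""
        return (kind, value.strip().lower()) in self._entries


def should_ignore(blacklist, submission):
    """Ignore the submission if any of its tracked fields is blacklisted."""
    return any(
        blacklist.is_blocked(kind, submission.get(kind, ""))
        for kind in ("ip", "email", "name")
    )
```

A typical flow would call `block("ip", ...)` when a message is marked as spam, then call `should_ignore` on every new submission before doing any other work.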

Honeypots

Because we were still getting spam, we added three honeypot fields:

Hidden Field With an Empty Value

<input type="text" name="dare_to_enter_1" autocomplete="nope" tabindex="-1" hidden="">

<input type="text" name="dare_to_enter_2" autocomplete="nope" tabindex="-1" style="position:absolute;left:-2000px;">

Many spammers tend to fill out every field, including hidden ones. So in this case, the submission is ignored if the dare_to_enter_1 or dare_to_enter_2 field has a value in the POST request. Note that the two fields are hidden using different techniques.

Hidden Field With a Value

<input type="text" name="dare_to_change" value="dare_to_change_value" autocomplete="nope" tabindex="-1" hidden="">

In this case, the submission is ignored if dare_to_change has any value other than dare_to_change_value.
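On the server, the three honeypot checks can be combined into one function. The sketch below is an illustration in Python rather than the site’s actual code; `tripped_honeypot` is a hypothetical name, and `post` is assumed to be a dictionary of the submitted form fields. Treating a missing dare_to_change field as spam is also an assumption of this sketch, on the grounds that a browser always sends the field back.

```python
EMPTY_HONEYPOTS = ("dare_to_enter_1", "dare_to_enter_2")
FIXED_HONEYPOT = "dare_to_change"
FIXED_HONEYPOT_VALUE = "dare_to_change_value"


def tripped_honeypot(post):
    """Return True if the POST data shows that a honeypot field was touched."""
    # These fields must come back empty: a real user never sees them.
    if any(post.get(name, "") != "" for name in EMPTY_HONEYPOTS):
        return True

    # This field must come back with its original value, unchanged.
    # A missing field is also treated as tripped (assumption, see above).
    return post.get(FIXED_HONEYPOT) != FIXED_HONEYPOT_VALUE
```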

Time-based Protection

At this point, the amount of spam had dropped dramatically, but not completely. So we tried another approach, again based on the differences between how humans and bots interact with a form. Bots tend to submit a form immediately, whereas it takes a user longer to go through the fields and fill them out. So we started ignoring submissions where the time to post is less than X seconds. The following is an example of a hidden field in Laravel Blade:

<input type="hidden" name="loaded_at" autocomplete="nope" value="{{ $currentTime }}">

In this example, $currentTime is the timestamp at which the form is rendered, passed from the controller to the view. On form submission, this value is compared with the time the POST request is received by the server. If the difference is less than, say, five seconds, it indicates that the form was not submitted by a human.
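The comparison itself might look like the sketch below, shown in Python for illustration. It assumes loaded_at arrives as a Unix timestamp in seconds; the function name `submitted_too_fast` is hypothetical, and rejecting missing, malformed, or future timestamps is an extra assumption of this sketch rather than something described above.

```python
import math
import time

MIN_SECONDS_TO_SUBMIT = 5  # the five-second threshold from the example above


def submitted_too_fast(loaded_at, received_at=None):
    """Return True if the form came back sooner than a human could fill it in."""
    received_at = time.time() if received_at is None else received_at

    try:
        loaded = float(loaded_at)
    except (TypeError, ValueError):
        # A missing or non-numeric timestamp is treated as suspicious.
        return True
    if not math.isfinite(loaded):
        return True

    # A timestamp in the future gives a negative elapsed time, which is
    # also below the threshold and therefore rejected.
    return received_at - loaded < MIN_SECONDS_TO_SUBMIT
```

Keep in mind that the hidden field is controlled by the client, so a bot that understands the trick can send an older timestamp. Signing the value on the server is one way to make that harder.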

Where To Go From Here?

It’s been a while since we last received any spam, but if you’re dealing with more sophisticated bots there are more options to look at. For example, Guard lets you block a submission by phrases in addition to IPs. If you need a different type of storage than MongoDB, feel free to create your own driver by extending the AbstractDriver class, which can be found here. PRs are always welcome! You can also consider integrating Akismet’s spam detection API into your application. Google has also recently introduced the score-based reCAPTCHA v3, which is completely different from previous versions, where users needed to pass a challenge. Hopefully our experience helps you tackle spam. Good luck and happy spam hunting!