Revelations about captcha solving services, bots and how to really stop spam

Captchas, anti-spam questions, e-mail confirmation - do they really stop bots?

People usually don't know much about bots and their capabilities. Some believe that a bot can never register on their website if the form has a math question like "2+4", a captcha image with text, or a simple question like "Are you a human?". In fact, these methods stop only some of the bots. At Devseon we are experienced automation experts and website creators, so we decided to share some of our knowledge with you and then offer you some anti-bot techniques.

Captchas

Have you heard about captcha solving services?

These are companies that offer human-solved captchas. For example, a bot fills in every field, and when it gets to the captcha image it sends the image to the captcha solvers. They solve the captcha and send the text back to the bot, which puts it in the field.

Does it work?

Yes. Here are some infamous websites that offer captcha solving services:

deathbycaptcha.com

de-captcher.com

bypasscaptcha.com

imagetyperz.com

They are popular enough to cause trouble for your website.

What can be done?

1. Complex captcha

The captcha solvers simply write down the text from the captcha image. If your captcha requires something more than just writing text, their service won't work.

Here is an example where you have to solve an easy puzzle in order to proceed:

Image recognition might also work.

For example, ask "How many circles are in the image?" about an image with different geometric shapes in it. The question can change randomly to ask about squares, triangles, etc. instead of circles, and the images can differ as well.

2. Flash Captcha

Flash captchas aren't in a format that most captcha solving services support. The services can hardly do anything with them, especially when the captcha requires some sort of user interaction to appear or to be read.

Examples:

An interactive captcha where you have to put the animals on the ground

And a simple flash captcha

Anti-spam questions:

1. Math Questions

Who calculates better, a human or a machine?

The only thing the bot has to do is transform the text of the question into a mathematical expression, and the answer is ready. Making users answer with words instead of digits can lead to problems with spelling mistakes, and it won't necessarily stop bots, which can use a built-in dictionary (1 - one, 2 - two, etc.).
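To show how little work this takes, here is a minimal Python sketch of what a bot might do with a question like "What is 2+4?". The function names are ours and purely illustrative, and the sketch assumes the question contains a plain expression built from integers and the + - * operators.

```python
import ast
import operator
import re

# Map AST operator nodes to the arithmetic they perform.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


def _evaluate(node):
    """Evaluate a tiny arithmetic AST made only of integers and + - *."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("unsupported expression")


def solve_math_question(question):
    """Find the first arithmetic expression in the text and compute it."""
    match = re.search(r"\d+(?:\s*[-+*]\s*\d+)+", question)
    if match is None:
        raise ValueError("no arithmetic expression found")
    return _evaluate(ast.parse(match.group(0), mode="eval").body)


print(solve_math_question("What is 2+4?"))
```

A dozen lines of parsing are enough, which is why a math question alone offers so little protection.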

2. Questions with String operations

Examples:

Question: Type in the middle four letters of "woeforbots"

Answer: forb

Question: Type 'noitseuq' in reverse.

Answer: question

These might work for a while, until the bot makers teach their bots the terms you use and what you mean by them (middle X letters, first X letters, last X letters, reverse, etc.).
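As a rough illustration of how quickly those terms can be taught to a bot, the Python sketch below handles the two example questions above. It assumes the word to operate on is quoted and that the count is spelled out as a small number word; the function name and the word table are hypothetical.

```python
import re

# Illustrative table of the number words the sketch understands.
NUMBER_WORDS = {"two": 2, "three": 3, "four": 4, "five": 5, "six": 6}


def answer_string_question(question):
    """Return the answer to a simple string question, or None if unknown."""
    quoted = re.search(r"[\"']([^\"']+)[\"']", question)
    if quoted is None:
        return None
    word = quoted.group(1)
    text = question.lower()
    if "reverse" in text:
        return word[::-1]
    sized = re.search(r"\b(first|last|middle) (\w+) letters\b", text)
    if sized is None or sized.group(2) not in NUMBER_WORDS:
        return None
    count = NUMBER_WORDS[sized.group(2)]
    if sized.group(1) == "first":
        return word[:count]
    if sized.group(1) == "last":
        return word[-count:]
    # "middle": centre a window of `count` letters inside the word.
    start = (len(word) - count) // 2
    return word[start:start + count]


print(answer_string_question('Type in the middle four letters of "woeforbots"'))
print(answer_string_question("Type 'noitseuq' in reverse."))
```

Every new phrasing you invent costs the bot maker only one more branch like these.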

3. Logical questions and other types

Example:

Question: Identify which item is Yellow - RoadBananaRedHouse

Answer: Banana

When bots encounter a question that requires some type of knowledge, they might try to google it.

If the question is too specific and the answer doesn't easily appear in the search results, then brute force is the only way they can pass. By brute force I mean that the bot programmers reload the form until they have seen every question you have, and then hard-code the answers into the bot.

The more random questions of this type you have, the harder it will be for them to include all the answers in the bot. Many bot makers will give up when they see more than ten random logical questions.

A good practice is to change your questions from time to time or to add new ones.
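A question pool like this can be sketched in a few lines of Python. The questions, their ids and the function names below are made up for illustration; a real site would probably keep the pool in a database so that it can be changed without touching the code.

```python
import secrets

# Hypothetical pool: question id -> (question text, accepted answers).
QUESTIONS = {
    "q1": ("Which of these is yellow: road, banana, house?", {"banana"}),
    "q2": ("Which of these can fly: stone, bird, chair?", {"bird"}),
    "q3": ("What do you call frozen water?", {"ice"}),
}


def pick_question():
    """Choose a random question; keep the id in the user's session."""
    question_id = secrets.choice(list(QUESTIONS))
    return question_id, QUESTIONS[question_id][0]


def check_answer(question_id, submitted):
    """Compare the submitted answer with the accepted ones, ignoring case."""
    entry = QUESTIONS.get(question_id)
    if entry is None:
        return False
    return submitted.strip().lower() in entry[1]


question_id, text = pick_question()
print(text)
```

The id should live in the server-side session rather than in a hidden form field, otherwise a bot could always send back the id of the one question it already knows.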

Email Confirmation

I will keep this short.

POP3 exists and bots can use it.

Yes, bots can confirm emails.

If you are ready to make your human users wait for more than 10 minutes to receive the email (on purpose), then if you are lucky the bot makers might decide your website isn't worth the time and move on.

Some people say solving captchas takes time and good, honest users don't deserve going through this.

What about going to their email account, typing in a username and password, waiting for the inbox to load and then, in some cases, waiting several more minutes for the email to arrive before they can confirm it?

It's good that email confirmation is only for the registration.

Other methods

If you want to check posted text for spam (not just prevent bot registration), get familiar with Akismet, Mollom and SBlam!. All of them analyze user-submitted data and flag spam automatically. Mollom sometimes presents a captcha, but only when it's unsure.

The best option is to develop your own system that is tuned to the mechanics of your website and can prevent spam bots from posting and registering.

Here are some popular methods.

The Honeypot Method

In 2007, Phil Haack suggested a clever method of detecting bots by using a honeypot.

The idea behind the honeypot method is simple. Website forms would include an additional field that is hidden to users.

Spam robots won't notice the CSS that makes the field invisible, and since it looks like an important field, they will put a value in it and trigger the trap. If data is inserted into this "honeypot", the website administrator can be fairly certain that it was not done by a genuine user and can act accordingly.
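The server-side part of the trap might look like the Python sketch below. The field name "website", the CSS class and the function name are our own assumptions, and the form is passed in as a plain dictionary so that the example does not depend on any particular web framework.

```python
# Hypothetical honeypot field name; pick something that looks tempting.
HONEYPOT_FIELD = "website"

# Markup the form would contain. The "hp" class is assumed to be
# hidden by a rule such as `.hp { display: none; }` in the stylesheet.
FORM_SNIPPET = (
    '<div class="hp">'
    '<label>Website <input type="text" name="website" value=""></label>'
    "</div>"
)


def is_probable_bot(form):
    """Return True when the hidden honeypot field came back filled in."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


human = {"name": "Ann", "comment": "Nice post!", "website": ""}
bot = {"name": "x", "comment": "cheap pills", "website": "http://spam.example"}
print(is_probable_bot(human), is_probable_bot(bot))
```

What to do with a flagged submission is up to you. Silently discarding it, rather than showing an error, gives the bot maker less feedback about why it failed.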

The honeypot method can be made more sophisticated by using JavaScript and data hashing. These obfuscation methods are not hack-proof, but we can assume that robots are not sophisticated enough to enter the required information.

JavaScript can be used to fill in hidden fields dynamically, which server-side validation can check for.

Additional timestamp and session data checks can also be used to detect automated submissions. A discussion on Stack Overflow provides many examples and ideas about this, including the implementation of Hashcash, which is available as a WordPress plug-in. A jQuery tutorial explains a similar method.
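To give an idea of how a Hashcash-style check works, here is a simplified proof-of-work sketch in Python. It is not the real Hashcash stamp format: the use of SHA-256, the 12-bit difficulty and all function names are assumptions made for illustration. In practice the solve step would run as JavaScript in the visitor's browser, and only the verify step would run on your server.

```python
import hashlib
import itertools
import secrets

DIFFICULTY_BITS = 12  # illustrative; higher values cost the client more work


def new_challenge():
    """Create a random challenge to embed in the form and the session."""
    return secrets.token_hex(16)


def _has_leading_zero_bits(challenge, nonce, bits):
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    # The hash qualifies when its top `bits` bits are all zero.
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def solve(challenge, bits=DIFFICULTY_BITS):
    """Client side: try nonces until the hash has enough leading zero bits."""
    for nonce in itertools.count():
        if _has_leading_zero_bits(challenge, nonce, bits):
            return nonce


def verify(challenge, nonce, bits=DIFFICULTY_BITS):
    """Server side: a single hash is enough to check the client's work."""
    return _has_leading_zero_bits(challenge, nonce, bits)


challenge = new_challenge()
nonce = solve(challenge)
print(verify(challenge, nonce))
```

The asymmetry is the point: a single visitor hardly notices the few thousand hashes, while a bot posting to thousands of forms has to pay that cost every time.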

Just as with captchas, what stops intruders is not so much the particular method as the presence of any hurdle at all. As mentioned, spammers currently have too many targets to bother searching for a back door.

Recording User Time Expenditure

Another rather simple method that can be implemented without annoying users is to distinguish between users and bots by measuring the time they take to fill out a registration form or compose a comment.

By estimating the average time spent on a comment, one could define certain rules.

For example, if a submission takes less than five seconds, which is virtually impossible for a human but just enough time for a bot to do its job, you could ask the user to try again. Jack Born’s tutorial on a slight variation of this concept for jQuery is worth a peek, since most users have JavaScript enabled.
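A server-side version of this check, which works even when JavaScript is disabled, can be sketched in Python as follows. The idea is to put a signed timestamp into a hidden field when the form is rendered and to compare it with the current time on submission. The secret key, the five-second threshold and the function names are assumptions for illustration.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical; keep the real key private
MIN_SECONDS = 5            # threshold taken from the example above


def _sign(text):
    return hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()


def issue_token(now=None):
    """Build the hidden-field value when the form is rendered."""
    issued = int(time.time() if now is None else now)
    return f"{issued}.{_sign(str(issued))}"


def is_suspicious(token, now=None):
    """True when the token is forged or the form came back too quickly."""
    current = time.time() if now is None else now
    issued_text, _, signature = token.partition(".")
    expected = _sign(issued_text)
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return True  # missing or tampered token
    return current - int(issued_text) < MIN_SECONDS


token = issue_token(now=1000)
print(is_suspicious(token, now=1002), is_suspicious(token, now=1030))
```

Signing the timestamp matters, because otherwise a bot could simply send back an old timestamp and appear to have taken its time.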

The whole endeavor is based on one crucial assumption - spammers prefer going after the easiest targets and will leave a website untouched if their initial attempt fails (although this can never be guaranteed).

Manual comment approval

If you don't wish to delay your users with interactive captchas, manual approval might be a good way to stop spam.

We don't need to mention the increasing number of human spammers hired for a few dollars who can pass through any bot defense. Taking responsibility and removing the burden from users will improve their interactions with and impressions of your website. Manually moderating content is often a sacrifice worth making.

Conclusion

If you wish to protect your website from bots searching for easy prey, you can try the honeypot method or user time expenditure.

They will probably move on when they encounter difficulties. If your website is a specific target of a bot programmer, not just another number on a list, then it's best to use interactive captchas.

If you wish to protect your comments from human spammers, you can use Akismet, Mollom, SBlam!, etc. As mentioned above, manual comment approval is also a very good option if you have the time.

What kind of techniques do you use to protect your sites from spam? Share in the comments below and thank us with a tweet or G+ recommendation!