The method Woot uses to combat this issue is changing the game - literally. When they present an extraordinarily desirable item for sale, they make users play a video game in order to place the order.

Not only does that successfully combat bots (they can easily make minor changes to the game to defeat automated players, or even roll out a new game for each sale), but it also gives users the feeling of "winning" the desired item while slowing down the ordering process.

It still sells out very quickly, but I think that the solution is good - re-evaluating the problem and changing the parameters led to a successful strategy where strictly technical solutions simply didn't exist.

Your entire business model is based on "first come, first served." You can't do what the radio stations did (they no longer make the first caller the winner; they make the 5th or 20th or 13th caller the winner) - it doesn't match your primary feature.

No, there is no way to do this without changing the ordering experience for the real users.

Let's say you implement all these tactics. If I decide that this is important, I'll simply get 100 people to work with me, we'll build software to work on our 100 separate computers, and hit your site 20 times a second (5 seconds between accesses for each user/cookie/account/IP address).

You have two stages:

1. Watching the front page
2. Ordering

You can't put a captcha blocking #1 - that's going to lose real customers ("What? I have to solve a captcha each time I want to see the latest woot?!?").

So my little group watches, timed together so we get about 20 checks per second, and whoever sees the change first alerts all the others (automatically). They load the front page again, follow the order link, and complete the transaction (which may also happen automatically, unless you implement a captcha and change it for every wootoff/boc).

You can put a captcha in front of #2, and while you're loath to do it, that may be the only way to make sure that even if bots watch the front page, real users are getting the products.

But even with captcha my little band of 100 would still have a significant first-mover advantage - and there's no way you can tell that we aren't humans. If you start timing our accesses, we'd just add some jitter. We could randomly select which computer was to refresh so the order of accesses changes constantly - but still look enough like humans.
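To make that concrete, here's a rough sketch of what one of those 100 watcher boxes might look like once jitter is added. The URL, interval, and alert mechanism are placeholders of my own invention, not anything real bots are known to run; the point is just that no single IP shows a clean, mechanical period.

```python
# Sketch of one hypothetical watcher box; URL and timing are made up.
import random
import time
import urllib.request

FRONT_PAGE = "https://shop.example.com/"   # placeholder, not a real address
MEAN_INTERVAL = 5.0                         # ~5s per box => ~20 checks/sec across 100 boxes

def fetch(url: str) -> bytes:
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read()

def alert_the_others() -> None:
    # Placeholder for whatever coordination channel the group uses (IRC, a shared queue, ...).
    print("item changed - go!")

last_seen = fetch(FRONT_PAGE)
while True:
    # Jittered sleep: from the server's side, this IP has no fixed refresh period,
    # so timing analysis has nothing mechanical to latch onto.
    time.sleep(random.uniform(0.5, 1.5) * MEAN_INTERVAL)
    page = fetch(FRONT_PAGE)
    if page != last_seen:
        alert_the_others()
        last_seen = page
```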

First, get rid of the simple bots

You need an adaptive firewall that watches requests, and if someone is doing the obvious stupid thing - refreshing more than once a second from the same IP - employs tactics to slow them down (drop packets, send back refusals or 500 errors, etc.).

This should significantly drop your traffic and alter the tactics the bot users employ.
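As a sketch of what that first filter might look like (the threshold and burst size below are illustrative guesses, and a real deployment would live in the firewall or load balancer rather than in application code), a per-IP token bucket catches the naive refresh-every-second bots:

```python
# Minimal per-IP token-bucket sketch; 1 req/sec with a burst of 5 is an
# assumed threshold, not a tuned number.
import time
from collections import defaultdict

RATE = 1.0      # tokens replenished per second
BURST = 5.0     # bucket capacity

_buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

def allow(ip: str) -> bool:
    """Return True if this request should be served, False if throttled."""
    b = _buckets[ip]
    now = time.monotonic()
    b["tokens"] = min(BURST, b["tokens"] + (now - b["last"]) * RATE)
    b["last"] = now
    if b["tokens"] >= 1.0:
        b["tokens"] -= 1.0
        return True
    return False   # caller can drop the packet, return 500, or delay the reply
```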

Second, make the server blazingly fast.

You really don't want to hear this... but...

I think what you need is a fully custom solution from the bottom up.

You don't need to mess with the TCP/IP stack, but you may need to develop a very, very, very fast custom server that is purpose-built to correlate user connections and react appropriately to various attacks.

Apache, lighttpd, etc. are all great for being flexible, but you run a single-purpose website, and you really need to do more than the current servers are capable of (both in handling traffic and in appropriately combating bots).

By serving a largely static webpage (updated every 30 seconds or so) from a custom server, you should be able to handle 10x the number of requests and traffic, because the server isn't doing anything other than getting the request and reading the page from memory into the TCP/IP buffer. It will also give you access to metrics that might help you slow down bots.

For instance, by correlating IP addresses you can simply block more than one connection per second per IP. Humans can't go faster than that, and even people behind the same NATed IP address will only infrequently be blocked. You'd want to do a slow block - leave the connection alone for a full second before officially terminating the session. This can feed into a firewall that gives longer-term blocks to especially egregious offenders.
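Here's a minimal sketch of that idea using Python's asyncio. The port, page contents, one-second threshold, and slow-block behaviour are all assumptions for illustration - a production version would be written much closer to the metal, but the shape is the same: page in memory, one lookup per request, stall and drop anything hitting faster than once a second.

```python
# Sketch of "static page from memory" plus a per-IP slow block.
import asyncio
import time

PAGE = b"<html><body>current item</body></html>"   # refreshed elsewhere every ~30s
RESPONSE = (b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n"
            b"Content-Length: %d\r\nConnection: close\r\n\r\n" % len(PAGE)) + PAGE

last_hit = {}   # ip -> monotonic time of last served request

async def handle(reader, writer):
    ip = writer.get_extra_info("peername")[0]
    await reader.readline()                      # consume the request line, ignore the rest
    now = time.monotonic()
    if now - last_hit.get(ip, 0.0) < 1.0:
        # "Slow block": sit on the connection for a full second, then drop it,
        # so the bot's retry loop stalls instead of immediately retrying.
        await asyncio.sleep(1.0)
        writer.close()
        return
    last_hit[ip] = now
    writer.write(RESPONSE)                       # page is already in memory, no disk or app logic
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, "0.0.0.0", 8080)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())
```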

But the reality is that no matter what you do, there's no way to tell a human apart from a bot when the bot is custom built by a human for a single purpose. The bot is merely a proxy for the human.

Conclusion

At the end of the day, you can't tell a human and a computer apart when they're watching the front page. You can stop bots at the ordering step, but the bot users still have a first-mover advantage, and you still have a huge load to manage.

You can add blocks for the simple bots, which will raise the bar so that fewer people will bother with it. That may be enough.

But without changing your basic model, you're out of luck. The best you can do is take care of the simple cases, make the server so fast regular users don't notice, and sell so many items that even if you have a few million bots, as many regular users as want them will get them.

You might consider setting up a honeypot and marking user accounts as bot users, but that will have a huge negative community backlash.
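If you decided to try it anyway, the mechanics are simple. Below is a hypothetical sketch using Flask - the route name, the hidden-link convention, and flag_account() are all invented for illustration:

```python
# Hypothetical honeypot: a URL linked only from an invisible (CSS-hidden) anchor,
# so only scrapers that follow every link ever hit it.
from flask import Flask, request, abort

app = Flask(__name__)

def flag_account(account_id: str) -> None:
    # Placeholder: record that this account followed a trap link.
    print(f"flagging {account_id} as a suspected bot")

@app.route("/todays-woot-early-access")
def honeypot():
    account_id = request.cookies.get("account_id", "anonymous")
    flag_account(account_id)   # humans never see the link; bots opening every URL do
    abort(404)
```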

Every time I think of a "well, what about doing this..." I can always counter it with a suitable bot strategy.

Even if you turn the front page into a captcha that gates the ordering page ("This item's ordering button is blue with pink sparkles, somewhere on this page"), the bots will simply open all the links on the page and use whichever one comes back with an ordering page. There's just no way to win this.

Make the servers fast, put in a reCaptcha (the only one I've found that can't be easily fooled, but it's probably way too slow for your application) on the ordering page, and think about ways to change the model slightly so regular users have as good a chance as the bot users.

-Adam