That night, and for the two following nights, saboteurs hiding in UCSD’s crowd went to work. At first, the attackers just scattered pieces that had already been correctly assembled, like a child petulantly trashing a half-finished jigsaw. Then the attacks became more sophisticated, exploiting bugs in the team’s code to pile hundreds of chads one on top of another, or moving important pieces far off the virtual mat where they couldn’t be seen.

An army of genuine users valiantly tried to repair the damage but the attackers seemed too numerous and too fast. Not once, but twice the group was forced to reset the puzzle to a previously saved configuration.

“Our first response was ‘Oh crap!’ Then we looked in the database for patterns of destruction, and rolled everything back to before that,” remembers Lian. As the attacks continued, the team tried blocking individual accounts they suspected of being malicious, and then whole IP addresses to contain the destruction. “I lost five kilos doing this Challenge,” says Cebrian. “I got really sick. We were working without sleep for days in a row.”

On November 24, an email from an anonymous Hushmail address landed in the team’s inbox. It taunted UCSD about the team’s security lapses, claimed that the sender had recruited his own horde of hackers from the notorious 4chan bulletin board, and revealed exactly how he had used proxy servers and virtual private networks (VPNs) to launch his attacks.

“I too am working on the puzzle and feel that crowdsourcing is basically cheating,” read the email. “For what should be a programming challenge about computer vision algorithms, crowdsourcing really just seems like a brute force and ugly plan of attack, even if it is effective (which I guess remains to be seen).” He signed off with the phrase “All Your Shreds are Belong to U.S.”

That was the jokey name of the team then in first place. Its leader, an experienced coder and inventor named Otavio Good, vehemently denied responsibility for the attacks. And the San Francisco-based Shreds team did seem legit: it was using custom computer vision algorithms to solve the puzzles, with humans double-checking the software’s work.

But paranoia reigned at UCSD. “We looked at the Shreds team members and wondered, is this person capable of sabotage? Or this one?” says Lian. He even tried to geo-locate their IP addresses to see where they lived. Nothing led back to the attacker. Meanwhile, the team was desperately trying to shut the stable door: changing the interface to permit only one move every 30 seconds, preventing pieces from being stackable, and making registration mandatory. There was also a plan to develop a reputation system, where only the best performing users would be allowed to contribute to the puzzle. Nothing helped.

Hundreds of users melted away before the team’s eyes, and those that remained were disorganized and demoralized. Not a single new productive player joined the UCSD effort following the attacks.

Overall, their crowd was only two thirds as efficient as it was before, and nearly ten times slower to recover. A week later, on December 1, All Your Shreds are Belong to U.S. completed the fifth and final document to claim DARPA’s $50,000 prize.

The identity of the attacker remained a secret. Cebrian vowed to continue to investigate the sabotage. But he was doubtful that his quest would succeed. “We will probably never know the true story about this,” he said then.

The data detective

That would likely have been true, if not for a young French data scientist called Nicolas Stefanovitch. In 2011, Stefanovitch was half a world away from the Shredder Challenge, teaching computer science at Dauphine University in Paris. Two years later, when he was a post-doc researcher in Abu Dhabi, a fascinating dataset arrived from Cebrian in Australia: the log-in and move tables from UCSD’s Shredder Challenge. The tables contained a complete record of the position and movement of each of the thousands of puzzle pieces, who had moved them, and the IP addresses they had used; over 300,000 entries in all.

Just as the Challenge teams had reassembled documents from a mess of tiny shreds, Cebrian asked Stefanovitch to painstakingly recreate the contest itself, hunting through a haystack of genuine users for the telltale pinpricks of those who wanted to unravel the crowd’s best efforts. Unlike UCSD’s legions, though, Stefanovitch was a crowd of one.

After a month of crunching the numbers, Stefanovitch was getting nowhere. With so many users working on the puzzle simultaneously, it was proving impossible to distinguish attacks from normal gameplay. Then he had a thought: if the shredded documents were a problem in vision, perhaps the attacks could be solved the same way? Stefanovitch animated the data, ignoring the content of the shreds themselves but plotting their movements over time.

When the first animation ran, he knew he was on to something. Dozens of likely attackers jumped off his laptop screen. These users either placed and removed chads seemingly at random, or moved pieces rapidly around the board. It was hardly surprising that the UCSD researchers believed they were under attack from a large group. But Stefanovitch was still a long way from a solution. “It was super hard to determine who was a saboteur,” he says. “Most of the people who looked like attackers, were not.”

Many of the high-speed moves turned out to be from genuine players responding to attacks, whereas others were just the actions of inept puzzlers. A few assaults were so rapid, however, that Stefanovitch thought the saboteurs might have deployed specialized software attack tools.

Stefanovitch set about identifying features (unique characteristics in the data) that he could match with behaviors on the board. He ended up with 15 features to separate saboteurs from honest users, and slowly homed in on those whose actions were destructive. There were far fewer than anyone had suspected: under two dozen email addresses.

“I found a peak in recruitment that corresponds almost exactly to when the attacker claims he made an announcement on 4chan,” says Stefanovitch. “But I detected only a very small scale attack at this time, an attack so tiny you couldn’t even see it if you didn’t know it was there.”

Stefanovitch speculates that any 4chan hackers who logged on to wreak havoc soon got bored. “They might have been attackers but they weren’t motivated; they had nothing to gain from scorching our puzzle.”

Once he had eliminated the 4chan wave, Stefanovitch could identify the hardcore attackers. He then tracked their behavior forward and backward through time. When he re-watched his simulation of the very first attack, he struck gold. The initial assault was a sluggish affair, about ten times slower than subsequent hacks, as though the saboteur was still feeling out the system’s weaknesses. “When he realized he maybe could be traced, he logged off. Twenty minutes later, he logged in again with a different email address and continued doing the same stuff,” remembers Stefanovitch.

Crucially for Stefanovitch, the attacker had left his digital fingerprints on the system. When he logged in again from the same IP address, Stefanovitch was able to associate the two email accounts. As the attacks accelerated, the team in San Diego banned the attacker’s usernames. He, in turn, opened a stream of webmail accounts, eventually leading UCSD to block his IP address. The attacker then hijacked a neighbor’s wifi router and used a VPN to log in from different IPs. Yet he stumbled again, connecting from the new IPs with old, discredited usernames. No matter how many disposable emails the attacker now used, Stefanovitch could link them all back to him.
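The linking step described here, where an email seen on one IP address ties together every other email seen on that IP, can be illustrated with a small sketch. This is not Stefanovitch's actual code, which the article does not describe; it is one plausible way to group accounts, assuming a log of (email, IP) pairs and using a standard union-find structure. The function name `link_accounts` and the sample addresses are hypothetical.

```python
from collections import defaultdict


def link_accounts(logins):
    """Group emails that are connected through shared IP addresses.

    `logins` is an iterable of (email, ip) pairs (a hypothetical log format).
    Returns a list of sets, each holding emails that share a chain of IPs.
    """
    parent = {}

    def find(node):
        # Follow parent links to the root, shortening the path as we go.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Each login ties one email to one IP; tag nodes so the two never collide.
    for email, ip in logins:
        union(("email", email), ("ip", ip))

    # Collect emails by the root of the component they ended up in.
    groups = defaultdict(set)
    for kind, value in list(parent):
        if kind == "email":
            groups[find((kind, value))].add(value)
    return sorted(groups.values(), key=sorted)


# Made-up sample data, for illustration only.
logins = [
    ("first@example.com", "10.0.0.1"),
    ("second@example.com", "10.0.0.1"),   # new email, same IP
    ("second@example.com", "10.0.0.2"),   # old email reused from a new IP
    ("third@example.com", "10.0.0.2"),
    ("honest@example.com", "10.0.0.9"),
]
clusters = link_accounts(logins)
```

In this toy log, the three throwaway addresses fall into one cluster because each shares an IP with another, while the unrelated user stays alone. A real analysis would also have to rule out shared IPs among honest users, such as people on the same campus network.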

Three years after the Challenge, and after six months of solid work, Stefanovitch was finally able to sketch out a map of email addresses and IPs that covered all the destructive accounts.

He had solved the first-ever documented attack on a deployed crowdsourcing system. And the results were terrifying.

By Stefanovitch’s reckoning, just two individuals had accounted for almost all the destruction, eviscerating the completed puzzle in about one percent of the moves and two percent of the time it had taken a crowd of thousands to assemble it. Yet the attacker had left one more clue, a blunder that pointed right back to his door. During the first attack, he had logged in with an email address from his very own domain.

Inside job

Late last year, Stefanovitch and Cebrian collaborated on a paper about the Challenge. When I read it, I asked Stefanovitch whether he had tried contacting the attacker. “Tracing him was the most exciting aspect of the project, it felt like a thriller,” says Stefanovitch, who still had a few technical questions about the attacks. “But I was very busy so I just dropped it.”

He was, however, happy to share the attacker’s email with me. I got in touch with Adam and we finally spoke just before Christmas. It was a confusing experience at first. I found it hard to reconcile the softly spoken, modest voice on the phone with the high-octane firebrand I was expecting. Adam was thoughtful, even hesitant, choosing his words with care. But once we started talking about the Challenge, he gradually opened up.

Adam had first heard about the Shredder Challenge on a Reddit hacker thread, while working on character recognition and computer vision at a document imaging firm. “I had a little bit of background in that arena and decided to take a stab at it,” he told me. “My team, basically just me and a friend, was not super organized. We were having fun with it and didn’t really expect to win.”