In 1997, world chess champion Garry Kasparov went to battle against the IBM supercomputer Deep Blue in a landmark match. After six games Deep Blue prevailed, marking the first time a computer had defeated a reigning world champion under tournament conditions.

But chess isn’t the only game in town.

A couple of weeks ago, an artificial intelligence once again squared off against world-class human gamers. This time—at the Brains vs. Artificial Intelligence challenge at the Rivers Casino in Pittsburgh—supremacy of human or machine was decided not by chess but by an epic 14 days and 80,000 hands of no-limit Texas hold ‘em. That’s right: The newest battlefield in the War Against the Machines is the poker table.

Representing the Machines: Claudico, an AI built at Carnegie Mellon University, the school where Deep Blue’s precursor, Deep Thought, was developed. Fighting for the Users: Jason Les, Dong Kim, Bjorn Li, and Doug Polk, four of the world’s best professional poker players. The tournament was the first time a program had competed against elite human professionals in no-limit Texas hold ‘em.

It’s a game of particular interest to AI researchers. Of all the poker variations, no-limit hold ‘em is one of the most sophisticated. Each player gets two cards only he or she can see. There’s a round of betting, and then the dealer reveals five cards available to all players—three cards (the flop), one card (the turn), and then the last card (the river)—with a round of betting after each. In limit hold ‘em players can only bet in fixed increments, but in no-limit, anyone can bet any amount, from a single chip to going “all in” with everything they have. You can leverage a strong hand to extract more value from your opponent, or bluff with a weak hand to win pots your cards don’t deserve. It’s hard.

So hard, in fact, that AI researchers have been looking at poker since the 1990s. Today it’s one of the field’s most important benchmarks. Unlike chess, poker is a game of imperfect information—no player has all the available data. An algorithm capable of determining optimal strategy in imperfect-information scenarios could have applications in cybersecurity, medicine, and military strategy. “Most real world settings are imperfect information games,” says Tuomas Sandholm, whose team designed Claudico. “You don’t know exactly what the state of the world is because you don’t know everybody else’s private information.”

Even better, computers have already essentially solved the simpler variants, including heads-up limit hold ‘em. No-limit hold ‘em is the last big challenge. Sandholm estimates that the number of unique situations that can arise in a game is greater than the number of atoms in the universe—squared. “The game is so big that you can’t even fit it into memory,” he says.

Microsoft Research and Rivers Casino put up $100,000 to cover the players’ appearance fees and to make the grueling 13 hours a day of play a bit more appealing. The team from Carnegie Mellon structured the challenge so that Claudico would simultaneously play each human one-on-one, 20,000 hands apiece for a total of 80,000, with the winner, AI or humans, decided by whoever held the most chips at the end (no actual money was at stake). Place your bets.

Computing Poker

Sandholm and his team approached Claudico’s development in three stages. First they fed the rules of no-limit hold ‘em into an abstraction algorithm, reducing the game to something smaller in scope and more computationally tractable. Then they ran customized algorithms that attempt to come as close as possible to a Nash equilibrium, the game-theoretic state in which no player can gain by changing strategy. Finally, the team used reverse mapping techniques to translate that strategy back into the game’s original, unabstracted parameters.
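The article doesn’t name Claudico’s equilibrium-finding algorithm, but regret-based self-play methods (such as counterfactual regret minimization) are the standard family for this middle step. Here is a minimal, illustrative sketch of the core building block, regret matching, applied to rock-paper-scissors, a toy game whose Nash equilibrium is to play each move a third of the time:

```python
# Regret matching on rock-paper-scissors: each player shifts probability
# toward actions it regrets not having played. The time-averaged strategy
# converges toward the Nash equilibrium (uniform, for this game).

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    # +1 win, 0 tie, -1 loss for the player holding `mine`
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def expected_payoff(action, opp_strategy):
    return sum(p * payoff(action, opp) for opp, p in zip(ACTIONS, opp_strategy))

def strategy_from_regrets(regrets):
    # Play actions in proportion to positive regret; uniform if none
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positives]

def train(iterations=10_000):
    # Start one player biased toward rock so the dynamics are visible
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sum = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strats = [strategy_from_regrets(r) for r in regrets]
        for p in (0, 1):
            for i, prob in enumerate(strats[p]):
                strategy_sum[p][i] += prob
            opp = strats[1 - p]
            ev = sum(prob * expected_payoff(a, opp)
                     for a, prob in zip(ACTIONS, strats[p]))
            for i, a in enumerate(ACTIONS):
                regrets[p][i] += expected_payoff(a, opp) - ev
    # Average strategy over time approximates the equilibrium
    return [[s / sum(row) for s in row] for row in strategy_sum]
```

Real poker involves hidden cards and sequential betting, so production systems run regret updates over an abstracted game tree rather than a one-shot matrix game, but the regret-driven feedback loop is the same.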

As a player, Claudico rarely falls into a recognizable pattern. That, along with a variety of unorthodox bet sizes, gives the machine a distinct advantage over humans. “Typically humans use one or two bet sizes, because they are worried that they are going to signal too much about their own private cards,” says Sandholm. “Claudico’s reasoning guarantees that it’s balanced.”

On the other hand, no-limit poker takes an enormous amount of computational power. So Claudico’s programmers couldn’t generate algorithms that solved every problem. “We run into this classic artificial intelligence tradeoff of solution quality versus reasoning time,” explains Sandholm. “We don’t have infinite time, and therefore we have to make some compromises in how we reason.” Claudico can only get close to Nash Equilibrium; it doesn’t react to the specific tendencies of individual opponents. The machine instead approximates ideal rational play, no matter the circumstances.

The Human Factor

In some ways, Claudico’s approach is something human players can only aspire to. “If you are playing game theory optimal, you are indifferent to how your opponent plays,” says Jason Les, 29, one of the pros who played in the tournament. “Your strategy will, at worst case, break even.” Les still thought he had an edge going in. He just didn’t know how it would manifest itself. “I really didn’t know what to expect,” he says. “I understood there would be some frequency of the time where this bot was amazing and we had no chance of winning.”

When the competition began, Les was struck by the unique and finely calibrated nature of the AI’s betting scheme. “It uses a mixed strategy. It will do multiple things with the same hand,” says Les. Even the best human players eventually leave traces of an identifiable pattern in their betting behavior, which can then be used by savvy opponents to more accurately gauge the value of their two hole cards. Not Claudico. “It has all that perfectly balanced and randomized,” Les says—with perhaps a trace of awe.
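The balanced mixed strategy Les describes can be pictured as a lookup from hand strength to action probabilities, with the action sampled at random each time. The hand categories and numbers below are invented for illustration; they are not Claudico’s actual values:

```python
import random

# Hypothetical mixed strategy: with the same hand, a balanced player
# randomizes among several actions at fixed frequencies rather than
# always doing the same thing. All probabilities here are made up.
MIXED_STRATEGY = {
    "strong": [("raise_big", 0.5), ("raise_small", 0.3), ("call", 0.2)],
    "medium": [("call", 0.7), ("raise_small", 0.2), ("fold", 0.1)],
    "weak":   [("fold", 0.6), ("raise_big", 0.25), ("call", 0.15)],  # bluffs mixed in
}

def choose_action(hand_category, rng=random):
    actions, probs = zip(*MIXED_STRATEGY[hand_category])
    return rng.choices(actions, weights=probs)[0]
```

Because a big raise can come from either a strong hand or a bluffing weak one, no single action reveals the hand, which is exactly the unpredictability that impressed the pros.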

So the professionals adopted a constantly changing, exploitative strategy designed to locate and attack specific quirks in Claudico’s play. For example, the AI couldn’t process card removal—the way the cards in one’s own hand affect the likelihood of an opponent holding specific combinations. Because Claudico didn’t factor that in, Les says, the humans could tell when it was making big bets to disguise a weak hand, trying to force its opponent to fold.

That tell meant Les and his colleagues could pick off gigantic bluffs on the river by calculating that their hole cards made it unlikely Claudico had a hand as big as its bet would suggest. “It was writing a check it can’t quite cash,” says Les.

Another chink in the AI’s armor was the way it responded to its opponents’ bet sizes. To reduce the size of the “game space” Claudico had to traverse in its search for solutions, the developers limited the number of bet sizes the program would recognize. If Claudico had no data for, say, a bet of half the pot in a given hand, some percentage of the time it would react as if the wager were three quarters of the pot, and the rest of the time as if it were one quarter. That’s a big problem: it meant the AI didn’t always respond correctly. The humans capitalized on that. “Bjorn started using the most unusual bet sizes,” Les says. “He was falling in between the known sizes a lot, and was causing Claudico to have difficulties.”
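That kind of randomized bet-size mapping can be sketched in a few lines. The linear weighting below is a simplifying assumption for illustration; published systems use more carefully designed formulas (Sandholm’s group has described a “pseudo-harmonic” mapping), but the exploitable ambiguity is the same:

```python
import random

def translate_bet(bet, known_sizes, rng=random):
    """Map an observed bet (as a fraction of the pot) onto one of the
    bet sizes the abstraction knows about. Simplified illustration:
    real systems use better-motivated mapping formulas."""
    known_sizes = sorted(known_sizes)
    if bet <= known_sizes[0]:
        return known_sizes[0]
    if bet >= known_sizes[-1]:
        return known_sizes[-1]
    for lo, hi in zip(known_sizes, known_sizes[1:]):
        if lo <= bet <= hi:
            # Closer bets map to a size more often, but never always,
            # which is the ambiguity the human players exploited
            p_lo = (hi - bet) / (hi - lo)
            return lo if rng.random() < p_lo else hi
```

A half-pot bet falling between known sizes of a quarter pot and three quarters of a pot gets treated as each of them about half the time, so a player who deliberately bets in the gaps forces the AI to misjudge the pot odds a large fraction of the time.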

Judgment Day

In the end, the ability to exploit Claudico’s departures from optimal play carried the humans to victory. When the final hand of the competition was completed, the players had wagered around $170 million (theoretically), and the team of human professionals was ahead $732,713.

Sandholm doesn’t count it as a loss, though. He says that because the humans’ margin fell short of statistical significance at the 95 percent confidence level, the match was essentially a tie.
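Sandholm’s argument can be sketched with a simple z-test. The hand count and dollar total come from the article, but the per-hand standard deviation below is an assumed placeholder; no-limit variance is enormous, which is how a six-figure edge over 80,000 hands can still fail to clear the 95 percent bar:

```python
import math

HANDS = 80_000
HUMAN_EDGE = 732_713    # total (theoretical) dollars won by the humans
PER_HAND_SD = 5_000     # hypothetical per-hand standard deviation, for illustration

def z_score(total_edge, hands, per_hand_sd):
    # How many standard errors the observed per-hand win rate sits above zero
    mean_per_hand = total_edge / hands
    standard_error = per_hand_sd / math.sqrt(hands)
    return mean_per_hand / standard_error

z = z_score(HUMAN_EDGE, HANDS, PER_HAND_SD)
# Two-sided 95 percent significance requires z to exceed roughly 1.96;
# under this assumed variance, the humans' edge falls well short.
```

Under these assumptions the z-score comes out near 0.5, far below the 1.96 threshold, so whether the margin is “significant” hinges entirely on how swingy the per-hand results really were.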

Not everyone agrees. Les and his fellow human poker players think the final dollar count is a pretty clear indicator of who won. So does at least one other AI expert. “The margin of victory was substantial in poker terms,” says Michael Bowling, one of the creators of another poker-playing bot, Cepheus.

Still, both the computer scientists and the poker professionals agree that the outcome shows just how fast AI is advancing. It took eight years and a couple of attempts for Deep Blue to triumph over Kasparov. By the time computers began to dominate in chess, research in that field had been underway for nearly four decades. Compared to all that, the night is still young for poker. “While humans may still be ahead for now,” says Bowling, “it’s really just the beginning of the end.”

In other words: They’ll be back.