Dreber's experiment was born in a bar. Over drinks with her husband Johan Almenberg and roommate Thomas Pfeiffer, she was talking about an attention-grabbing psychological study that she thought was “cute, but unlikely to be true.” When she wondered aloud how good her instincts were, Pfeiffer brought up a paper by economist Robin Hanson at George Mason University. Titled “Could Gambling Save Science?”, it suggested that researchers could reach a more honest consensus on scientific controversies by betting on their outcomes, much as traders bet on the future prices of goods.

“It blew us all away,” says Dreber. In 2012, she and her colleagues contacted Nosek, who agreed to add prediction markets to his big Reproducibility Project.

Here's how it worked. Each of the 92 participants received $100 to buy and sell stocks in 41 studies that were in the process of being replicated. At the start of the trading window, each stock cost $0.50. If the study replicated successfully, the stock would pay out $1; if it didn't, it would pay nothing. As trading went on, the prices of the stocks rose and fell depending on how much the traders bought or sold.

The participants tried to maximize their profits by betting on studies they thought would pan out, and they could see the collective decisions of their peers in real time. The final price of each stock, at the end of the two-week experiment, reflected the probability that the study would be successfully replicated, as determined by the collective actions of the traders. A final price of $0.83 meant the market predicted an 83 percent chance of replication success. If the final price was over $0.50, Dreber's team counted it as a prediction of success; if it was under, as a prediction of failure.
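The settlement rule described above can be sketched in a few lines. This is a simplified illustration of how a winner-take-all prediction-market share pays out and how a final price is read as a forecast; it is not the actual trading platform Dreber's team used, and the function names and prices are made up:

```python
# Illustrative sketch of a binary (winner-take-all) prediction market share.
# A share pays $1 if the study replicates and $0 if it does not.

def settle(shares_held: int, replicated: bool) -> float:
    """Dollar payout for a holding of winner-take-all shares at settlement."""
    return shares_held * (1.0 if replicated else 0.0)

def market_prediction(final_price: float) -> tuple[float, bool]:
    """The final price doubles as the market's probability of replication;
    a price above $0.50 counts as a predicted success."""
    return final_price, final_price > 0.50

prob, predicted_success = market_prediction(0.83)
# A final price of $0.83 reads as an 83 percent chance of replication,
# and counts as a prediction of success.
```

The key design point is that the payoff is binary: because each share is worth exactly $1 or $0 at settlement, a rational trader's willingness to pay approximates their subjective probability of success, which is why the final price can be read directly as a forecast.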

In the end, the markets correctly predicted the outcomes of 71 percent of the replications, a statistically significant if not mind-blowing score. Then again, based on the final prices, the team expected the markets to be right only 69 percent of the time, which they roughly were. (Remember that those prices are probabilities of success, so they carry uncertainty about their own predictions.)
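That 69 percent expectation follows directly from treating final prices as probabilities: a market priced at $0.83 predicts success and should be right 83 percent of the time, while one priced at $0.40 predicts failure and should be right 60 percent of the time. Averaging across markets gives the expected accuracy. A sketch with made-up prices (not the study's actual data):

```python
def expected_accuracy(final_prices: list[float]) -> float:
    """If final prices are well-calibrated probabilities of replication,
    the chance that each market's binary call (price vs. $0.50) is correct
    is p for a predicted success and 1 - p for a predicted failure."""
    return sum(p if p > 0.5 else 1 - p for p in final_prices) / len(final_prices)

# Hypothetical final prices for five markets (illustrative only):
prices = [0.83, 0.40, 0.65, 0.20, 0.55]
print(round(expected_accuracy(prices), 2))  # prints 0.69 for this made-up set
```

Note that the expected accuracy is highest when prices sit near $0 or $1 and lowest when they hover near $0.50, which is why a market full of uncertain traders cannot expect to be right much more often than it is confident.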

“There is some wisdom of crowds; people have some intuition about which results are true and which are not,” says Dreber. “Which makes me wonder: What's going on with peer review? If people know which results are really not likely to be real, why are they allowing them to be published?”

Well, says Nosek, market participants only care about whether the study will replicate, while reviewers are also weighing experimental design, importance, interest, and other factors. Also, reviewers by their nature work alone, and Dreber's traders performed poorly when working solo. When Dreber asked them individually to predict the replication odds for each study, they were right just 58 percent of the time, no better than chance. Collectively, they became more effective because they could see what their peers were thinking.