Earlier this month at BuzzFeed News, we announced we’d be grading this year’s election forecasts. On Monday afternoon, Michigan’s Board of Canvassers finally certified Trump’s win. Now that every state has been called — assuming that Jill Stein’s recount effort doesn’t change things — we have the results.

Yes, the polls were wrong. But some forecasters, who typically rely on polls and often combine them with other data to give odds on who will win, were less wrong than others. It doesn’t take fancy math to determine that Nate Silver’s FiveThirtyEight forecasts, although they gave Hillary Clinton better odds than they did Trump, were the least wrong. Not only did Silver give Trump more than a 1-in-4 chance to win the election, but he also repeatedly defended his forecasts’ bullishness on Trump, for reasons that later proved prescient. Other high-profile forecasts gave Trump small-to-vanishing odds.

To recap, here are the forecasts we examined — listed by the likelihood Trump would beat or tie Clinton in the Electoral College:

These prognostications are based on how each of the 50 states and the District of Columbia were expected to vote. We can get a better sense — though still imperfect — of the forecasters’ judgment by looking at how they predicted the individual state races.

The basics boil down to this: Which forecasters got the most calls right? Which came closest to the final margins between Clinton and Trump? Which forecasters best balanced confidence and correctness? (For example: Did they give Clinton a 99% chance of winning Michigan or an 80% chance? And were they right?)

The simple approach to judging complicated forecasts

The simplest approach is just to count the number of states each forecaster called correctly. But that misses critical nuance. For example, it doesn’t take into account the difference between 51% odds and 99% odds. Still, it’s an easy place to start.

The Princeton Election Consortium’s Sam Wang guessed more states correctly than anyone else we examined: 46 plus the District of Columbia. Silver’s FiveThirtyEight, and almost every other forecast, got 45 correct. (Wang, unlike the other forecasters, thought Trump would win North Carolina.)

This highlights a paradox: Despite his misplaced hyperconfidence that Clinton would take the White House — giving her a 99% chance — Wang guessed more states “right” than anyone else.

So why did he think Trump had such a small chance of winning? Not only did state polls fail, but most failed in the same way: underestimating Trump. This is called “correlated error.” All forecasters realize that polls aren’t perfect; sometimes they interview the wrong people or weight their responses incorrectly. But part of a forecaster’s job is to estimate how likely — and how extensive — correlated error could be. Many forecasters considered that prospect unlikely, but Nate Silver didn’t.

In his post-election mea culpa, Wang pinpointed this mistake as his forecast’s Achilles' heel. “I did not correctly estimate the size of the correlated error — by a factor of five,” he wrote. “Polls failed, and I amplified that failure.”

A more nuanced approach

Ready for a bit more math? A metric called the Brier score is widely used to quantify forecasters’ accuracy — in elections and beyond. (It’s the main metric we said we’d use for grading. We've posted the data and code behind these calculations on GitHub.)

Brier scores take into account just two things: How likely did the forecaster think something would happen, and did it? Brier scores reward confidence when you’re correct, but penalize confidence when you’re wrong.

Smaller scores are better. Zero is the best possible score — it means you were 100% confident in your predictions, and every one of them came true. The worst possible score is 1 — you were 100% confident in your predictions, and every one of them was wrong.
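For the curious, the calculation itself is simple. Here’s a toy sketch in Python — the probabilities and outcomes below are invented for illustration, not taken from any real forecast:

```python
# Toy example: Brier score for three hypothetical state forecasts.
# Each pair is (probability the forecaster gave a Trump win, actual outcome),
# where the outcome is 1 if Trump won the state and 0 if he didn't.
forecasts = [
    (0.2, 0),  # confident call that came true: small penalty
    (0.9, 1),  # confident call that came true: small penalty
    (0.1, 1),  # confident call that proved wrong: large penalty
]

# The Brier score is the mean squared difference between each
# forecast probability and what actually happened.
brier = sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)
print(round(brier, 4))  # the wrong call dominates the score
```

Notice how the one badly wrong call — 10% odds on something that happened — contributes far more to the score than the two correct calls combined.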

Below are two types of Brier scores for each forecast. The first is weighted by each state’s electoral votes, so that Pennsylvania (20 votes) counts five times as much as New Hampshire (4 votes). The second counts each state equally: