After another shocking win for Bernie Sanders in Indiana, many in mainstream media are scratching their collective heads over why polling data showed Clinton winning the state. Let me explain how the media’s polling data is flawed & how, in head-to-head general election match-ups, Bernie Sanders destroys Trump, while Clinton vs. Trump results in a 50–50 coin toss.

Why Is Polling Data Wrong This Election Cycle?

For the Indiana Democratic primary, every single poll showed Clinton winning. Even the political data junkies at FiveThirtyEight gave Clinton a 90% probability of winning.

The reason everyone was wrong is simple: they weren’t using honest data. By honest, I mean data that accurately reflects what it is measuring. There are two major reasons the polls keep getting it wrong:

Independent Voters: this group now represents around 42% of all voters, making it the largest group of voters, by far. Every poll I’ve studied under-indexes Independent voters in their data by a large margin.

Young Voters & Revitalized Voters: many polls either don’t call cell phones (young people don’t typically have landlines) or over-index landlines. Whether they lean on Census data or on historical geographic voting data, the polls also seem to have trouble finding the voters who sit on the sideline until an inspiring candidate comes along.

In stark contrast to New York, Indiana ran an open primary, allowing both young voters and Independent voters to participate fully. Guess what other election allows young and Independent voters to participate? That’s right, the General Election. So that begs the question…

Who Can Defeat Trump?

Back in March I analyzed Reuters data to see who does better against Trump in head-to-head match-ups. Bernie came out on top, while Clinton lost to Trump.

Note: Right after my post spread throughout social media, Reuters abruptly stopped recording Sanders data & I called them out on it. I trust Reuters data, but am skeptical of Thomson Reuters’ multi-million dollar ties to the Clinton Foundation.

When I did my analysis, I was smart enough to point out how Independents were the primary reason why Sanders & Trump did so well. However, I didn’t account for data mistake #1 above (under-indexing of Independents). Reuters data is collected online, so it addressed #2, but with regard to #1, I recently discovered that Reuters’ mix of Republicans, Democrats, and Independents was off by a factor of 4:

On the left is the Reuters polling breakdown. On the right is the true mix of today’s political landscape.

In Reuters’ defense, all the other polls fail to include the correct percentage for Independents as well. Many even exclude respondents from outside the party (meaning only registered Democrats can take the Democratic candidates’ poll).

Readdressing Head-to-Head Match-up Between Trump & the Democratic Candidates

Now that Reuters has finally started tracking Sanders vs Trump again, I wanted to take a fresh look at the head-to-head match-ups, plus I wanted to correct the bias against Independent voters. Here are the uncorrected charts:

Hillary Clinton vs Trump: uncorrected data

Source: Reuters Polling, Clinton vs Trump, Filter: Registered Voters

As you can see, the chart moves around, but we are now in the 5th period where Clinton’s lead is within the margin of error (the April 29th uncorrected lead is only 4.3%).

Bernie Sanders vs Trump: uncorrected data

Source: Reuters Polling, Sanders vs Trump, Filter: Registered Voters

You can see the gap when Reuters stopped tracking. Unfortunately, they may have stopped again because the data is 5 days behind Clinton vs Trump data. You may also notice that Bernie Sanders has held a very comfortable lead throughout (currently a 16.8% uncorrected lead).

Corrected data

Now let’s fix the mix of respondents by adding more weight to the Independents’ voice and less to the Democratic & Republican respondents. Again, the mix should be: 42% Independent, 29% Democrat, and 26% Republican.
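For readers who want to try the reweighting themselves, here is a minimal Python sketch of the idea. It is not the spreadsheet itself. The function name reweight and the per-party support figures in the example are hypothetical placeholders, not Reuters data. The sketch also assumes the weights should be divided by their sum, since 42 + 29 + 26 adds up to 97 rather than 100. Whether the spreadsheet handled the leftover 3% the same way is not stated.

```python
# Target party mix quoted in the post (percent of all voters).
TARGET_MIX = {"Independent": 42.0, "Democrat": 29.0, "Republican": 26.0}


def reweight(support_by_party, target_mix=TARGET_MIX):
    """Blend per-party support into one topline number.

    support_by_party maps party -> percent of that party's respondents
    who back a given candidate. Weights are normalized by their sum
    (an assumption), because the quoted mix totals 97, not 100.
    """
    total_weight = sum(target_mix.values())
    weighted_sum = sum(
        support_by_party[party] * weight
        for party, weight in target_mix.items()
    )
    return weighted_sum / total_weight


# HYPOTHETICAL per-party support for one candidate, for illustration only.
example_support = {"Independent": 30.0, "Democrat": 70.0, "Republican": 10.0}
example_topline = reweight(example_support)
print(f"Reweighted support: {example_topline:.1f}%")
```

To reproduce a real row, swap in the per-party percentages that the Reuters party filter reports for each candidate.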

Here’s a peek of the spreadsheet I created to recalculate the appropriate amounts (Reuters lets you filter by party):

Hillary Clinton vs Trump — Fixing the Independent Vote %

In the Registered voters column we see the original Reuters-reported result of Clinton winning 42.7% to 38.4%. But once adjusted, Clinton’s win slims down to a razor-thin 37.3% to 37.0%. Those numbers may seem a little low in a two-candidate race, so I will later fix the non-vote #s to match what we are used to seeing on election day. Here’s the Bernie vs Trump data:

Bernie Sanders vs Trump — Fixing the Independent Vote %

By taking into account the correct Independent vote %, Bernie goes from a crushing 49.2% to 32.4% win to a commanding 44.9% to 33.2% win.

General Election Results

As I alluded to above, the numbers may seem low in a two-candidate race. That’s because Reuters includes non-votes in the head-to-head match-ups (marked as “neither” or “I won’t vote”). But that’s not how elections work. Only those who vote will get their vote counted. So, taking the findings above, I can remove the non-voters to get election day-like results.
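Removing the non-voters amounts to rescaling each match-up so the two candidates' shares add up to 100%. Here is a small Python sketch of that step, using the adjusted figures quoted earlier (37.3% vs 37.0% and 44.9% vs 33.2%). The function name two_way_share is my own, and I am assuming a simple proportional rescale, so the rounded output may differ slightly from the chart.

```python
def two_way_share(candidate_a, candidate_b):
    """Rescale two poll shares so that together they sum to 100%.

    This drops the "neither" / "I won't vote" respondents by dividing
    each candidate's share by the combined share of both candidates.
    """
    total = candidate_a + candidate_b
    return 100.0 * candidate_a / total, 100.0 * candidate_b / total


# Adjusted (Independent-corrected) figures quoted in the post.
clinton, trump_vs_clinton = two_way_share(37.3, 37.0)
sanders, trump_vs_sanders = two_way_share(44.9, 33.2)

print(f"Clinton {clinton:.1f}% vs Trump {trump_vs_clinton:.1f}%")
print(f"Sanders {sanders:.1f}% vs Trump {trump_vs_sanders:.1f}%")
# prints:
# Clinton 50.2% vs Trump 49.8%
# Sanders 57.5% vs Trump 42.5%
```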

Election Day Results based off fixed, more accurate Reuters data:

There you have it. The most accurate data shows us Clinton vs Trump is basically a coin-flip, while Bernie Sanders dominates Trump in a landslide victory.

The data above may be understated because it doesn’t take into account recent polls showing Trump beating Clinton outright, nor non-quantitative information like debate match-ups (I believe Trump will attack Clinton’s very long list of controversies). There’s also the possibility of FBI criminal charges over her classified emails or Clinton Foundation foreign funding conflicts of interest, which would surely hurt her numbers. Next I’d like to find state-by-state data to create an electoral map estimate, but I expect the results to be just as pronounced.