Several months ago, the well-known political analysis blog FiveThirtyEight released an article unambiguously titled: “Primary Turnout Means Nothing for the General Election.”

The article sought to deflate Republican optimism by dispelling the notion that high Republican primary turnout was evidence that Trump would enjoy greater success in the general election. The point was put succinctly by the pundits at FiveThirtyEight: “history suggests that there is no relationship between primary turnout and the general election outcome.”

Ironically, the evidence the pundits present does not reveal the absence of a relationship so much as the appearance of a weak correlation. Furthermore, given that their attempted refutation rests on just six data points, their dismissive conclusion comes off as a tad overconfident.
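To get a feel for how little six data points can settle, here is a minimal sketch of an approximate confidence interval for a correlation, using the standard Fisher z-transformation. The correlation value of 0.2 is a made-up stand-in for "weak," not a figure taken from the FiveThirtyEight article, and the function name is my own.

```python
import math

def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation, which requires n > 3.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)              # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    lo = z - z_crit * se
    hi = z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # back-transform to the r scale

# Hypothetical weak correlation observed across six elections.
low, high = pearson_ci(r=0.2, n=6)
print(f"r = 0.20, n = 6 -> 95% CI ({low:.2f}, {high:.2f})")
```

With these inputs the interval runs from roughly -0.73 to 0.87. In other words, six observations are compatible with anything from a strong negative relationship to a strong positive one, which is hardly grounds for declaring that no relationship exists.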

More to the point, there’s a difference between asking “can A cause B” and “does recent variation in A explain variation in B.” For example, there’s no statistical connection between cyanide poisoning and a state’s overall death rate. This does not mean that cyanide is harmless. It means that cyanide poisonings cannot explain why death rates were high in some states, yet low in others, in recent years. By the same token, just because primary turnout wasn’t decisive in every past election doesn’t mean that it won’t prove decisive in the future.

The basic problem with FiveThirtyEight’s reasoning is that it fails to consider the following: a causal relationship between primary and general election turnout could exist at the state level, despite being weaker at the national level.

As we shall see, there are several reasons to think this is so:

Reason 1: States Fluctuate Between Hosting Primaries and Caucuses

Primary turnout at the national level is calculated by adding up all votes cast for Republicans and Democrats in state-run primaries. Caucus results, by contrast, are not included in national measures of primary turnout.

Since states vary from election to election in whether they hold primaries or caucuses, national comparisons of the total number of primary votes cast in different years are not apples-to-apples comparisons. If we compare national primary turnout in one year to turnout in another, we’re effectively comparing results from two different sets of states, when we should be comparing results from the same set of states in different years. The end result is that, each election, primary support is overestimated for one party and underestimated for the other.

During the last four competitive presidential primaries (1992, 2000, 2008, and 2016), 12 states (24%) did not consistently hold primaries or caucuses, but oscillated between holding primaries, caucuses, and no election whatsoever. Given that these states collectively possess 81 electoral votes, failing to include these states by excluding caucus results could easily throw off a correlation between a handful of data points.
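The apples-to-apples point can be sketched in a few lines. The example below uses three invented states and invented vote counts (none of this is real election data), and `matched_totals` is a hypothetical helper of my own naming. It compares a naive national total with a total restricted to states that held a primary in both years.

```python
def matched_totals(votes_a, votes_b):
    """Sum primary votes over states that held a primary in BOTH years.

    votes_a, votes_b: dicts mapping state -> primary votes cast.
    A state maps to None (or is absent) in a year it held a caucus
    or no contest at all.
    """
    common = sorted(
        s for s in votes_a
        if votes_a[s] is not None and votes_b.get(s) is not None
    )
    total_a = sum(votes_a[s] for s in common)
    total_b = sum(votes_b[s] for s in common)
    return common, total_a, total_b

# Made-up numbers for illustration only.
year_1 = {"A": 100_000, "B": 250_000, "C": None}     # C held a caucus
year_2 = {"A": 120_000, "B": 240_000, "C": 300_000}  # C switched to a primary

# Naive comparison: add up whatever primaries happened to be held each year.
naive_1 = sum(v for v in year_1.values() if v is not None)
naive_2 = sum(v for v in year_2.values() if v is not None)

# Matched comparison: same set of states in both years.
states, matched_1, matched_2 = matched_totals(year_1, year_2)

print("naive change:  ", naive_2 / naive_1 - 1)
print("matched change:", matched_2 / matched_1 - 1)
```

In this toy case the naive totals suggest turnout nearly doubled, while the matched totals show it barely moved. The whole difference comes from one state switching formats, not from any change in voter enthusiasm.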

Reason 2: Political Parties Fluctuate Between Hosting Primaries and Caucuses

People often assume that states can be easily divided into “caucus states” and “primary states.” The idea is that caucus states hold caucuses for both parties, whereas primary states do the same thing for primaries. However, like most things in life, the reality is more complicated.

Broadly speaking, state branches of the two major parties often hold different kinds of elections within the same state and election cycle. For instance, in 2000, Virginia Republicans held a primary whereas Virginia Democrats held a caucus. In 2016, Kentucky Democrats opted for a primary whereas Kentucky Republicans held a caucus.

If one party holds a primary and the other holds a caucus (or holds no election at all), then national primary totals will systematically underrepresent the true level of support for whichever party opted not to run a primary.

During the last four competitive presidential primaries (1992, 2000, 2008, and 2016), the Republican and Democratic parties of 11 states decided to hold a primary or caucus when their rivals did not, on at least one occasion. Since these 11 states have no fewer than 99 electoral votes, failure to include election results from these states could artificially depress correlations between a small number of data points.

Reason 3: Primary Turnout Only Matters for Swing States

The United States does not have a single election for president. Instead, the outcomes of fifty-one smaller elections (the fifty states plus the District of Columbia) determine who wins the presidency. As far as US elections are concerned, the relevant unit of analysis is the state.

This is why experts and laymen alike spend so much time discussing “battleground states,” as well as the various strategies used by candidates to achieve victory in those states. By contrast, most people don’t even bother to discuss the vast majority of state elections. This is because most states are dominated so thoroughly by a single party that the outcome of the vote is practically a foregone conclusion.

The basic error of the FiveThirtyEight pundits stems from their failure to examine primary and caucus turnout at the state level. In particular, they fail to consider that state primary turnout could cause an increase (or decrease) in general election turnout while being insufficient to change the outcome of any given national election. This incongruity could result from how the votes are counted (i.e., the Electoral College), or it could be the result of other factors.

For the sake of argument, let us assume that if a party experiences higher turnout in a state primary, it will also experience higher turnout in the general election (all else being equal).

If this assumption is true, are there any realistic ways in which a party could lose the general election despite receiving more primary votes? Yes, there certainly are:

(a) Increased primary turnout could be concentrated in a state the party would win anyway (ex. a GOP candidate who really turned out support in Alabama)

(b) A modest increase in primary turnout could be concentrated in enemy strongholds which they aren’t going to win anyway (ex. a Dem candidate who loses slightly less badly in Tennessee still gets zero electoral votes)

(c) The opposing party could experience an equivalent increase in primary turnout

(d) The opposing party could experience a greater increase in primary turnout

Any combination of the above factors could prevent increased primary turnout from giving a party an edge in the general election.
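Scenarios (a) and (b) can be illustrated with a toy winner-take-all tally. Everything in this sketch is hypothetical: the three states, their vote shares, and their electoral vote counts are invented, and the model simply assumes that a turnout boost translates directly into extra vote share.

```python
def electoral_votes(states, boost=None):
    """Count electoral votes won by party X under winner-take-all.

    states: dict of state -> (x_share, electoral_votes), x_share in [0, 1]
    boost:  optional dict of state -> vote share added to party X
    """
    boost = boost or {}
    return sum(
        ev
        for name, (share, ev) in states.items()
        if share + boost.get(name, 0.0) > 0.5
    )

# Hypothetical three-state country.
states = {
    "Safe":       (0.62, 9),   # party X wins regardless
    "Stronghold": (0.38, 11),  # party X loses regardless
    "Swing":      (0.49, 10),  # close race
}

base = electoral_votes(states)
# Five extra points in the two lopsided states changes nothing.
safe_boost = electoral_votes(states, {"Safe": 0.05, "Stronghold": 0.05})
# Two extra points in the swing state flips it.
swing_boost = electoral_votes(states, {"Swing": 0.02})

print(base, safe_boost, swing_boost)
```

A large gain spread across lopsided states leaves the electoral count untouched, while a small gain in the one competitive state changes it. This is why national totals can mask an effect that is visible state by state.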

The Effects of Primary Turnout on the General Election

Even if the establishment view of primary turnout is deeply flawed, no evidence has been put forward to substantiate the claim that primary turnout is associated with general election turnout… until now.

After analyzing results from the Federal Election Commission and conducting my own independent investigation into the results of competitive primaries and caucuses, I found several lines of evidence supporting a connection between primary turnout and general election performance. At the state level, I found that:

1. Republican and Democratic primary performance was linked to higher turnout in the general election.

2. Changes in Republican and Democratic primary turnout were linked to changes in general election turnout.

3. Republican and Democratic primary turnout was associated with each party’s electoral victories in the general election.

4. Republican primary turnout predicts general election performance after controlling for Democratic primary performance and for whether a state held a primary or caucus for either party.

These findings are displayed here:

Why This Bodes Well for Trump

In 2016, the share of eligible voters who voted in Republican primaries was the highest on record.

During the 2016 primaries, Democratic primary turnout not only declined significantly from its 2008 peak, but also ranked only fifth among the last ten elections.

Between 2008 and 2016, Republican primary and caucus turnout per 100,000 eligible voters rose in 87% of states. Republican turnout increased in 97% of Red States and in 79% of Blue States. If equal weight is assigned to every state, Republican primary turnout at the state level rose by an average of 54%. Republican turnout was up 65% in Red States and 46% in Blue States. Nationwide, Republican primary turnout was up 51%.

By contrast, the share of eligible voters who voted in Democratic primaries decreased in 88% of states between 2008 and 2016. Democratic turnout declined in 92% of Red States and in 83% of Blue States. If equal weight is assigned to each state, Democratic primary turnout declined by an average of 23%. Nationwide, turnout was down 35%.
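The figures above mix two kinds of averages: an equal-weight average of state-level changes, and a nationwide change computed from pooled totals. The sketch below shows how the two are calculated and why they can differ. The two states and all of their numbers are invented for illustration, and the helper names are my own.

```python
def rate_per_100k(votes, eligible):
    """Primary votes cast per 100,000 eligible voters."""
    return votes / eligible * 100_000

def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

# state -> (votes_2008, eligible_2008, votes_2016, eligible_2016)
# Made-up numbers for illustration only.
data = {
    "Small": (10_000, 500_000, 20_000, 500_000),
    "Large": (400_000, 10_000_000, 440_000, 10_000_000),
}

# Equal-weight average: compute each state's change, then average the changes.
state_changes = {
    s: pct_change(rate_per_100k(v0, e0), rate_per_100k(v1, e1))
    for s, (v0, e0, v1, e1) in data.items()
}
equal_weight = sum(state_changes.values()) / len(state_changes)

# Nationwide change: pool votes and eligible voters first, then compare rates.
v0 = sum(row[0] for row in data.values())
e0 = sum(row[1] for row in data.values())
v1 = sum(row[2] for row in data.values())
e1 = sum(row[3] for row in data.values())
nationwide = pct_change(rate_per_100k(v0, e0), rate_per_100k(v1, e1))

print(f"equal-weight average change: {equal_weight:.1f}%")
print(f"nationwide change:           {nationwide:.1f}%")
```

In this toy example the small state doubles its turnout rate and the large state grows by 10%, so the equal-weight average is 55% while the pooled nationwide change is only about 12%. Reporting both, as I do above, guards against either a few large states or many small ones driving the result.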

If we look at current swing states, we see an encouraging pattern for Mr. Trump. The average increase in Republican primary turnout across fourteen swing states was 56%. On the other hand, Democratic turnout in these states declined by an average of 16%. It should be noted that all figures in this section were adjusted for the size of the voter eligible population:

These findings are significant. Not only do they suggest that primary turnout influences the general election in the way one might expect, but they also lend credence to the idea that the polls may be systematically underestimating Trump’s true level of support.

As other commenters have noted, modern pollsters do not attempt to randomly sample the population. Since people are generally unwilling to fill out surveys, there is good reason for this. Instead, pollsters attempt to gather a large and diverse sample of participants, and to weight their surveys according to who they believe will vote.

Herein lies the problem: there’s a fair degree of uncertainty surrounding who will vote in the 2016 general election. Pollsters attempt to correct for this by weighting their samples according to how often various groups voted in previous elections. The problem is, there’s reason to believe that Trump enjoys a high degree of support from people who usually do NOT vote.

As such, pollsters may be systematically undercounting Trump supporters while granting undue weight to other groups. The fact that Republican primary turnout has surged throughout the country as Democratic turnout continues to wane lends credence to the idea that Trump’s support may run deeper than the polls suggest. The dramatic shift in primary turnout in swing states gives us reason to think that the polls may be underestimating Trump’s level of support in the places that matter most.
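The weighting argument can be made concrete with a toy likely-voter model. This is a deliberately simplified sketch, not a description of how any actual pollster weights its surveys: the two groups, their population shares, their support levels, and both turnout scenarios are all invented.

```python
def weighted_support(groups, turnout):
    """Estimate a candidate's vote share under assumed turnout rates.

    groups:  name -> (population_share, candidate_support)
    turnout: name -> assumed probability that a member of the group votes
    """
    votes_for = sum(pop * turnout[g] * sup for g, (pop, sup) in groups.items())
    votes_all = sum(pop * turnout[g] for g, (pop, _) in groups.items())
    return votes_for / votes_all

# Hypothetical electorate: infrequent voters favor the candidate more.
groups = {
    "habitual voters":   (0.6, 0.45),
    "infrequent voters": (0.4, 0.60),
}

# Turnout assumed from past elections versus a hypothetical surge.
past_turnout = {"habitual voters": 0.8, "infrequent voters": 0.3}
surge_turnout = {"habitual voters": 0.8, "infrequent voters": 0.5}

est_past = weighted_support(groups, past_turnout)
est_surge = weighted_support(groups, surge_turnout)

print(f"estimate using past turnout:  {est_past:.3f}")
print(f"estimate if infrequent surge: {est_surge:.3f}")
```

Under these made-up numbers, weighting by past turnout puts the candidate at 48%, while a surge among infrequent voters lifts the figure to a little over 49%. The size of the gap depends entirely on the assumed numbers; the point is only that the estimate moves in the candidate's favor when a supportive low-turnout group shows up in greater numbers than history predicts.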

Put simply, the data suggests that pollsters have failed to give ordinary Trump supporters the weight they truly deserve.

(1) One might argue that systematic mismeasurement of a party’s level of primary support could be prevented by folding caucus results into primary totals. However, while this solution may seem sensible, it is far easier said than done. Although the Federal Election Commission records primary turnout, it does not record caucus results. I’ve also been unable to find any official source that describes caucus turnout by state, party, and election year. The best I’ve been able to come up with is a few unofficial guesstimates of state caucus turnout from local media and party officials (yet even these are few and far between, especially pre-2008).

(2) Although my results were reliable (i.e. displayed the same broad pattern across several different elections), this fact does not imply that the data are without limitations. The correlations and regression coefficients displayed in this article were obtained by averaging analogous models from previous elections. The underlying models not only had large confidence intervals, but they were also unable to account for most of the state-level variation in general election turnout. Designing a model whose predictions a reasonable person would be confident in would require adding additional factors (e.g. a state’s past voting record, demographics, etc.). It would also require transforming some of the data in order to make it more consistent with the underlying assumptions of linear regression.

While my findings generally support an optimistic outlook on Trump’s general election performance, I’d strongly advise against using the models presented here as a basis for making specific predictions about specific battleground states.

(3) The bulk of the data is from the Federal Election Commission (1992-2012). An Excel file containing the data as well as the original source material can be found here.

(4) McDonald, M., & United States Elections Project. (2016). Voter Turnout. Retrieved September 15, 2016.

(5) DeSilver, D., & Pew Research Center. (2016, June 10). Turnout was high in the 2016 primary season, but just short of 2008 record. Retrieved September 14, 2016.