In this era of polarized politics, a new divisive subject seems to have arisen — the analysis, or lack thereof, of votes cast before Election Day. The predictive stakes have been going up: In 1996, only 11% of the ballots were cast ahead of voting day, but in 2016, early votes amounted to 41% of the total vote. The upcoming and critical 2018 midterm elections may see an even larger share of early voters.

Anecdotally, this cycle has seen diminishing coverage and analysis of early votes. This is likely an overcorrection from the 2016 election, when media dedicated significant bandwidth to parsing the early vote and, in most cases, interpreted it as a sign that Hillary Clinton would win. If you saw me on TV around that time, odds are I was feeding that narrative.

Percent Voting Absentee, By Mail, or Early, 2004–16. Data includes all 50 states, Washington, D.C., American Samoa, Guam, Puerto Rico, and the U.S. Virgin Islands. Mail Ballots were not tracked in the EAVS until 2008. Chart: U.S. Election Assistance Commission

In 2016, we made several mistakes in how we framed early votes. I’d argue the two biggest were failing to apply historical context (such as how the early vote compared with prior elections) and giving too little thought to whether the surge in early Democratic votes simply cannibalized Election Day votes rather than reflecting a true surge in Democratic intensity.

Regardless, ignoring the early vote this year would be a mistake. While there are pitfalls to analyzing it, there is also much to be learned. The important thing is to recognize what an early vote can and cannot tell us and to keep the analysis going with updated data.

At the time of this writing, close to six million Americans have already voted. Thanks to the depth of data available to the public through companies such as TargetSmart, we can perform rich analyses of the voting electorate in real time. To ignore this kind of granular data on millions of voters would simply be foolish, especially when you consider the attention we pay to polls with much smaller samples.

What the Data Can Tell Us

Let’s start with the obvious: Early vote data tells us who has voted. While we don’t know for whom they voted, we can connect this information to voter files that contain a wealth of demographic and historical political data. And then we can infer some important points.

The first thing to look at is how the early vote compares with prior elections. We should consider, for example, whether any particular groups are surging relative to 2014 or 2016. Looking at party registration (or modeled partisanship in states that don’t allow party registration), age, race/ethnicity, and gender can tell us a lot. A surge in black voters, as the data for Georgia shows at this time, suggests that Democratic intensity is high. Conversely, if there were a surge in white men voting early, it would tend to bode well for Republicans.
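As a rough illustration of that comparison, the sketch below computes each group's share of the early vote in two cycles and reports the change in percentage-point terms. The function names and the idea of passing in simple group counts are assumptions made for this example, not a description of any particular vendor's tooling, and any counts you feed it would come from your own voter file extract.

```python
def share_by_group(counts):
    """Convert raw early-vote counts per group into shares of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("need at least one early vote to compute shares")
    return {group: n / total for group, n in counts.items()}


def share_shift(current, baseline):
    """Change in each group's share of the early vote, current minus baseline.

    Both arguments map a group label (hypothetical labels such as an age
    bracket or a party registration) to a count of early voters at a
    comparable point in each cycle.
    """
    cur = share_by_group(current)
    base = share_by_group(baseline)
    groups = sorted(set(cur) | set(base))
    return {g: cur.get(g, 0.0) - base.get(g, 0.0) for g in groups}
```

Comparing shares rather than raw counts matters because total early turnout has been rising across cycles; a group can cast more early ballots than it did four years ago and still make up a smaller part of the early electorate.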

The second piece of context to weigh is the previous behavior of people who have voted early. Higher early turnout may simply reflect voters who would have voted anyway but chose to cast their ballots early instead of on Election Day. That’s less meaningful in the grand scheme of things. However, if there’s a true surge of first-time voters or another “low-propensity” group (i.e., a group unlikely to vote) that skews heavily toward one party, that would indicate a meaningful intensity advantage.

We can break down the early vote into categories of vote history, from “super voters” at the top of the scale to those who’ve never voted before at the other end. When we look at the vote share for each of these groups, a larger share or larger advantage for a party among low-propensity voters is a good sign for that party heading into Election Day.

Image: Tom Bonier
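The vote-history breakdown described above can be sketched in a few lines. This is a minimal example under stated assumptions: each early voter is represented as a (party, elections_voted) pair, the tier names and cutoffs (out of the last four general elections) are hypothetical choices for illustration, and "D" and "R" stand in for party registration or modeled partisanship.

```python
from collections import defaultdict


def vote_history_tier(elections_voted, elections_possible=4):
    """Bucket a voter by how many recent elections they voted in.

    The cutoffs here are illustrative assumptions, not an industry standard.
    """
    if not 0 <= elections_voted <= elections_possible:
        raise ValueError("elections_voted out of range")
    if elections_voted == 0:
        return "never voted"
    if elections_voted == elections_possible:
        return "super voter"
    if elections_voted * 2 >= elections_possible:
        return "frequent"
    return "infrequent"


def party_margin_by_tier(voters):
    """Summarize early voters by tier: totals, party shares, and D minus R margin.

    `voters` is an iterable of (party, elections_voted) pairs.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for party, elections_voted in voters:
        counts[vote_history_tier(elections_voted)][party] += 1

    summary = {}
    for tier, by_party in counts.items():
        total = sum(by_party.values())
        dem, rep = by_party["D"], by_party["R"]
        summary[tier] = {
            "total": total,
            "dem_share": dem / total,
            "rep_share": rep / total,
            "margin": (dem - rep) / total,  # positive favors Democrats
        }
    return summary
```

A positive margin in the "never voted" and "infrequent" tiers is the signal of interest here; a lead concentrated among "super voters" is more likely to reflect ballots that would have been cast on Election Day anyway.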

What the Data Cannot Tell Us

Most importantly, the early vote data can’t tell us how people have voted. We don’t know how many supported the Democrat, the Republican, or some other candidate. We can infer a fair amount from party registration and modeled partisanship, but those are imperfect tools. Polls might suggest that Democrats will overperform with better-educated suburban voters, especially women, which would put a dent in GOP numbers, but at the same time, Republicans may have made inroads with less-educated white rural voters. These crossover voters confound any attempt to use the early vote to accurately predict performance.

The early vote data can be very misleading when presented without necessary context. For example, recent reports suggested that GOP advantages in first-week early vote turnout indicated a smaller “blue wave” than most polls and analysts had been predicting. But the historical context was missing. The first week of voting mostly involves restrictive voting methods, like absentee voting by mail. Voters are often required to have a reason to vote early, such as an inability to vote in person on Election Day, and that segment can skew older and more conservative. If you compare the turnout with historical benchmarks, you’ll find that the GOP early-voting advantage in most states was actually much narrower than in the 2014 or even 2016 elections.