The potential for nonresponse bias

One possible explanation for errant polls in 2016 is nonresponse bias, which occurs when those who participate in a survey hold different opinions from those who do not participate (either because they could not be contacted or because they were contacted and refused). Of course, the threat of nonresponse bias is well known, and survey firms employ many strategies to prevent it. But with average response rates trending below 10 percent, nonresponse bias is increasingly a threat. Indeed, Nate Silver recently argued that nonresponse bias helps explain why the polls (especially state polls) underestimated Trump support.

But has the threat of nonresponse bias actually materialized? A 2012 Pew report found that it had not, arguing that properly conducted polls still “provide accurate data on most political, social and economic measures.” Furthermore, errors in presidential polling have not increased as response rates have declined. But what about 2016 in particular? This is where our research comes in.

Nonresponse wasn’t key to 2016

In October 2016, our undergraduate public opinion course at Cornell University conducted two nationally representative surveys: an online survey fielded by GfK (n=1,541) and a phone survey (cell and landline, n=625) conducted with Cornell’s Survey Research Institute (SRI). Like almost all surveys conducted at that time, our surveys had Clinton leading Trump.

But more than 20 percent of respondents did not report a voting intention for either Hillary Clinton or Trump, suggesting the possibility of “shy” or hidden Trump supporters. If so, the problem might not be nonresponse bias — instead, some Trump supporters might simply have been hesitant to report that they would vote for Trump.

We included a question in our surveys that allows us to estimate which candidate undecided respondents actually supported: “If you HAD to choose, which presidential candidate do you find to be more truthful: Donald Trump or Hillary Clinton?” For “decided” voters, this question accurately predicted their vote intention 98 percent of the time in the GfK survey and 94 percent in the SRI survey.

For undecided voters, the emphasis on truthfulness and the way the question required a choice (“if you HAD to choose”) should have helped alleviate respondents’ concerns about being judged for their response. Indeed, nearly every survey respondent answered it. We believe that undecided voters’ answers revealed their fundamental partisan leaning.

And once they revealed that leaning, Clinton’s lead shrank in both surveys. For example, in the GfK survey, the standard question showed that 42 percent supported Clinton, 34 percent supported Trump, 11 percent supported some other candidate, 12 percent did not plan to vote and 1 percent did not answer. So Clinton led by eight points.

But when respondents were asked the question about truthfulness, Clinton’s lead shrank by half, to four points. In other words, already in early October, pushing people to reveal their preferences by asking about truthfulness helped Trump.
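
The arithmetic behind a shrinking lead can be sketched in a few lines of Python. This is only an illustration, not the procedure used on the survey data: the function names are hypothetical, and the `trump_share` values are assumptions chosen to show how the calculation works, since the split among respondents who named neither candidate is not reported here. Only the topline shares (42, 34, 11, 12 and 1 percent) come from the GfK survey.

```python
def lead(clinton: float, trump: float) -> float:
    """Clinton's lead over Trump in percentage points."""
    return clinton - trump


def reallocate(clinton: float, trump: float,
               unallocated: float, trump_share: float) -> tuple[float, float]:
    """Split respondents who named neither candidate between the two.

    trump_share is the fraction of the unallocated group assigned to Trump.
    It is a hypothetical input, not a figure from the surveys.
    """
    if not 0.0 <= trump_share <= 1.0:
        raise ValueError("trump_share must be between 0 and 1")
    new_clinton = clinton + unallocated * (1.0 - trump_share)
    new_trump = trump + unallocated * trump_share
    return new_clinton, new_trump


# GfK topline from the standard vote-intention question (percent).
clinton, trump = 42.0, 34.0
unallocated = 11.0 + 12.0 + 1.0  # other candidate, not voting, no answer

print("Lead on the standard question:", lead(clinton, trump))

# Hypothetical splits, for illustration only.
for share in (0.50, 0.75):
    c, t = reallocate(clinton, trump, unallocated, share)
    print(f"trump_share={share:.2f}: lead = {lead(c, t):.1f}")
```

An even split leaves the lead unchanged, so the lead narrows only if the group that named neither candidate leans toward Trump.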

What about in the states?

The two polls we conducted were national polls. And at the end of the day, the national polls were pretty accurate. A bigger question — and a more stringent test — comes from the state polls. The state polls were less accurate. Moreover, the polling errors were correlated across states, meaning that the state polls tended to systematically underestimate Trump.

But when we use the question about truthfulness, that is no longer true. To generate state-level estimates of Trump support, we use a common approach called multilevel regression and poststratification. Because we’re using a single national-level survey to estimate state-level opinion, the estimates come with more uncertainty. Nevertheless, two important patterns stand out.
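
For readers unfamiliar with the method, the poststratification half can be sketched as follows. This is a minimal illustration under stated assumptions, not our actual model: the multilevel regression step that would produce the cell-level estimates is omitted, and the state name, demographic cells, estimates and population counts below are all made up. The function name `poststratify` is hypothetical.

```python
from collections import defaultdict


def poststratify(cell_estimates: dict, cell_counts: dict) -> dict:
    """Combine cell-level estimates into state-level estimates.

    cell_estimates maps (state, cell) to an estimated level of support.
    cell_counts maps (state, cell) to that cell's population count.
    Each state estimate is the population-weighted average of its cells.
    """
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for (state, cell), estimate in cell_estimates.items():
        count = cell_counts[(state, cell)]
        weighted_sum[state] += estimate * count
        population[state] += count
    return {state: weighted_sum[state] / population[state]
            for state in weighted_sum}


# Made-up inputs for a single hypothetical state with two cells.
estimates = {("State A", "college"): 0.40,
             ("State A", "non_college"): 0.55}
counts = {("State A", "college"): 400,
          ("State A", "non_college"): 600}

print(poststratify(estimates, counts))
```

Weighting by population rather than by sample size is what lets a national survey with few respondents in a given state still yield an estimate for that state, at the cost of the added uncertainty noted above.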

First, the errors do not appear to be correlated. In five of the states, our data overestimated Trump support, and in six we underestimated Trump support.

Second, Trump is estimated to lead Clinton in two of the three key Midwest battleground states — Wisconsin and Pennsylvania — where state polling made Trump victories appear extremely unlikely. Even in October, our estimates correctly call the winner in Ohio, North Carolina and Virginia, as well.

Of course, this is an analysis after the fact. Nevertheless, our data would not have predicted a Clinton landslide in the Rust Belt.

Investigations of the 2016 polls will undoubtedly continue. But our results shed some light on the possible sources of polling error. We don’t think that Trump voters were unable or unwilling to answer polls. Instead, they were polled but hesitant to voice their support for Trump.
