For example, in phone-based surveys, people without phones are never included in any sample. Of particular import for election surveys, the sampling frame includes many adults who are not likely to vote. Pollsters try to correct for this by using likely-voter screens, typically by asking respondents whether they will vote, but the screen itself can introduce error that is at times larger than the bias it was intended to correct.

And then there is nonresponse error, which arises when the likelihood of responding to a survey is systematically related to how one would have answered it. For example, as another one of our papers shows, supporters of the trailing candidate are less likely to respond to surveys, biasing the result in favor of the candidate who is ahead.

A similar effect probably explains part of Mrs. Clinton’s recent dip in the polls, as Democrats became less enthusiastic about answering surveys when she appeared to be struggling. With nonresponse rates exceeding 90 percent for election surveys, this is a growing concern.
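
To see how a small gap in willingness to respond can move a poll, consider a rough sketch in Python. The function name and the response rates below are hypothetical, chosen only for illustration and not taken from any actual survey: the race is assumed to be exactly tied, 9 percent of the leading candidate's supporters answer the survey, and 8 percent of the trailing candidate's supporters do.

```python
def observed_share(true_share, response_rate_a, response_rate_b):
    """Share of respondents backing candidate A when supporters of A and B
    respond at different rates. All inputs are fractions between 0 and 1."""
    responders_a = true_share * response_rate_a
    responders_b = (1 - true_share) * response_rate_b
    return responders_a / (responders_a + responders_b)


# Hypothetical inputs: a tied race with response rates of 9% and 8%.
tied_race = observed_share(0.5, 0.09, 0.08)
print(f"Observed share for A: {100 * tied_race:.1f}%")
```

Under these assumed numbers the poll would show candidate A with roughly 53 percent among respondents even though the race is tied, and interviewing more people would not change that figure.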

Finally, there is error in the analysis phase. In one example, as Nate Cohn showed in an Upshot article, four pollsters arrived at different estimates even when starting from the same raw polling data.

Other errors, which we believe are less important in United States election surveys, include effects of survey wording and interviewer bias.

All these nonsampling errors show up in two ways. First, polls within a race vary from one another slightly more than one would expect from classical textbook explanations. Second, and most markedly, polls tend to systematically overestimate or underestimate the true answer. Thus, even when many polls are averaged together, the result is biased in favor of one candidate or the other.

The accompanying graph shows, for each race, how much the average of all polls in that race differs from the final election outcome. If sampling variation were the only source of error, the polling average would be quite close to the election outcome. Standard theory says the average of this many polls should be within about half a percentage point of the true answer, and that this difference shrinks to zero as more polls are conducted.
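
The contrast between sampling variation and systematic error can be sketched with a small simulation. Everything here is an assumption made for illustration: the function names, the poll sizes, the number of polls, and the 3-point shared bias are hypothetical and are not the figures behind the graph. The first function is the textbook standard error for an average of simple random samples; the second simulates polls that all share the same bias.

```python
import math
import random


def sampling_se(share, n_respondents, n_polls=1):
    """Textbook standard error of the average share across n_polls
    independent simple random samples of n_respondents each."""
    return math.sqrt(share * (1 - share) / (n_respondents * n_polls))


def simulate_poll_average(true_share, n_respondents, n_polls,
                          shared_bias=0.0, seed=0):
    """Average estimated share over n_polls simulated polls.

    shared_bias is a hypothetical shift common to every poll, standing in
    for nonsampling error such as differential nonresponse.
    """
    rng = random.Random(seed)
    p = true_share + shared_bias
    estimates = []
    for _ in range(n_polls):
        hits = sum(rng.random() < p for _ in range(n_respondents))
        estimates.append(hits / n_respondents)
    return sum(estimates) / n_polls


# Hypothetical race: true share of 50%, polls of 1,000 respondents each.
for n_polls in (1, 10, 100):
    se_points = 100 * sampling_se(0.5, 1000, n_polls)
    print(f"{n_polls:>3} polls: sampling SE of the average = {se_points:.2f} points")

unbiased = simulate_poll_average(0.5, 1000, 100)
biased = simulate_poll_average(0.5, 1000, 100, shared_bias=0.03)
print(f"Average with no shared bias:  {100 * unbiased:.1f}%")
print(f"Average with 3-point bias:    {100 * biased:.1f}%")
```

In this sketch the sampling error of the average shrinks as polls are added, but an average of polls that share a bias settles near the biased value rather than the true one, which is the pattern described above.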