Online pollsters tend to take a model-based view of survey research. They don’t think polls work simply because of their design — though a good design helps — but instead because demographics and other characteristics can predict vote choice, and polls are weighted to match the country’s demographics.
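The weighting idea can be sketched in a few lines of code. This is a toy illustration of simple post-stratification, not any pollster's actual procedure, and every number in it is invented:

```python
# Toy post-stratification: reweight respondents so the sample's
# demographic mix matches known population shares.
# All figures are invented for illustration.

# Share of each education group in the population (assumed numbers).
population_share = {"college": 0.35, "no_college": 0.65}

# Share of each group among respondents (online samples often
# over-represent college graduates).
sample_share = {"college": 0.55, "no_college": 0.45}

# Each respondent's weight is the ratio of population share to sample share.
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Hypothetical candidate support within each group.
support = {"college": 0.60, "no_college": 0.45}

# Unweighted estimate reflects the skewed sample; the weighted estimate
# reflects the population mix.
unweighted = sum(sample_share[g] * support[g] for g in support)
weighted = sum(sample_share[g] * weights[g] * support[g] for g in support)

print(unweighted, weighted)
```

After weighting, each group contributes exactly its population share, which is why real polls of skewed samples can still land near the truth when the weighting variables predict vote choice well.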

But the model-based view doesn’t offer the clarity of the old, exclusively design-based way of thinking. Taken to its extreme, the model-based view implies that a good enough model can overcome the weaknesses of even the lowest-quality sample, in effect turning dirt into gold. One study, for instance, did a decent job of estimating the 2012 election results using a poll of Xbox users, hardly a representative slice of the electorate.

Online polling remains difficult

Just because it’s possible to turn dirt into gold doesn’t mean every nontraditional pollster has figured out how to do so. It’s quite difficult, and it’s also hard to tell whether a pollster has unlocked the dirt-to-gold alchemy.

What we know for sure is that much of the data collected by nontraditional means is of the dirt variety. For example, a 2016 study from Pew Research found there were substantial differences in the quality of online nonprobability samples: All but two of those samples fared worse than Pew’s online probability panel, which was recruited from Pew telephone surveys.

The online samples that fared well raise some concerns about the other online pollsters, if only by implication. YouGov, for instance, consistently fared best in the Pew test in part by using a distinctively sophisticated method of sample selection, called synthetic sampling: it picks individuals from its panel of respondents, one by one, to match the demographic profiles of individual Americans, so that the chosen sample mirrors the country’s demographics as a whole.
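A toy version of that matching step looks something like the following. The attributes, distance measure, and panel records here are invented for illustration; YouGov's actual procedure is far more elaborate:

```python
# Toy matched sampling: for each "target" person drawn from a
# representative frame, pick the most similar member of an opt-in panel.
# Attributes and records are invented for illustration.

ATTRS = ["age_group", "education", "region"]

def distance(a, b):
    """Count how many attributes differ between two people."""
    return sum(a[k] != b[k] for k in ATTRS)

def matched_sample(targets, panel):
    """Greedily match each target to the closest unused panelist."""
    available = list(panel)
    sample = []
    for t in targets:
        best = min(available, key=lambda p: distance(t, p))
        sample.append(best)
        available.remove(best)
    return sample

targets = [
    {"age_group": "18-29", "education": "no_college", "region": "south"},
    {"age_group": "65+", "education": "college", "region": "midwest"},
]
panel = [
    {"age_group": "18-29", "education": "college", "region": "south"},
    {"age_group": "65+", "education": "college", "region": "midwest"},
    {"age_group": "18-29", "education": "no_college", "region": "south"},
]

print(matched_sample(targets, panel))
```

Because each panelist stands in for one specific target person, the matched sample inherits the target frame's demographic profile, which is the point of the technique.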

Another survey has succeeded without probability samples, but by means that also raise doubts about other online surveys: the VoteCast survey, a new competitor to the exit poll fielded by NORC at the University of Chicago and sponsored by The Associated Press and other news media outlets.

The VoteCast polling of the midterms was conducted with unusual transparency, and it was a rare foray into this type of research by a well-regarded, traditional organization. The study combined a traditional telephone survey of 40,000 respondents with a large nonprobability sample of 110,000 respondents. The online-only element was calibrated in part by using the telephone survey data, and the overall results were reasonable.
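The calibration step can be sketched roughly as follows: use the probability-based phone survey to set benchmark targets, then reweight the online sample toward them. This is only a loose analogue of NORC's method, and every figure below is invented:

```python
# Rough sketch of calibrating a large nonprobability sample against a
# smaller probability (telephone) benchmark. All figures are invented;
# this is an illustration of the idea, not NORC's procedure.

# Party-ID shares estimated from the phone benchmark survey.
benchmark_share = {"dem": 0.35, "rep": 0.35, "ind": 0.30}

# Party-ID shares in the raw online sample, which under-represents
# independents in this made-up example.
online_share = {"dem": 0.40, "rep": 0.40, "ind": 0.20}

# Calibration weight per group: benchmark share / online share.
calib = {g: benchmark_share[g] / online_share[g] for g in benchmark_share}

# Hypothetical candidate support within each group in the online data.
support = {"dem": 0.90, "rep": 0.05, "ind": 0.50}

raw = sum(online_share[g] * support[g] for g in support)
calibrated = sum(online_share[g] * calib[g] * support[g] for g in support)

print(raw, calibrated)
```

The design choice is to let the small, expensive probability sample anchor the composition of the big, cheap online sample, buying the precision of 110,000 interviews without fully trusting how they were recruited.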