Wired has a long, fascinating article about political campaigns increasingly turning to very high-level data analytics rather than conventional polling to guide their decisions. The article focuses on Civis, a company founded by two former Obama data guys, who point out that old-fashioned polling just isn’t as accurate as it once was.





DURING PRIMARY SEASON, when they were still mainly just spectators to the 2016 presidential race, Dan Wagner and David Shor had a routine they liked to observe on election nights. The two men—the CEO and senior data scientist, respectively, of a startup called Civis Analytics—would stay late at work, drinking bourbon and watching returns come in. Their office, a repurposed industrial space in Chicago’s West Loop, would rattle every time the L train rumbled by.

As much as Wagner and Shor were following the political horse race itself, they were also watching to see how the race’s oddsmakers were doing. The US polling industry has been suffering a crisis of insight over the past decade or so; its methods have become increasingly bad at telling which way America is leaning. Like nearly everyone who works in politics, Wagner and Shor knew the polling establishment was liable to embarrass itself this year. It wasn’t a question of if, but when—and how badly.

It didn’t take long to find out. About 10 days before the Iowa caucuses in February, two major polls came out: One put Hillary Clinton ahead by 29 points; the other, as if it were tracking an entirely different race, showed Bernie Sanders leading by eight. In the Republican contest, Donald Trump topped the state’s final 10 polls and averaged a seven-point advantage. On the night of the caucus itself, the Civis office in Chicago was crowded with staffers gathered around a big flatscreen TV for a viewing party. They all watched as Clinton—and Ted Cruz—won the state.

But the biggest polling train wreck came a few weeks later, when the Michigan primary rolled around. In early March, every single poll gave Clinton at least a five-point lead; some had her ahead by as many as 20 points. Even ace statistician Nate Silver’s FiveThirtyEight—a go-to site ever since he correctly predicted outcomes in 49 out of 50 states in the 2008 presidential race—gave Clinton a greater than 99 percent chance of winning.

It also goes into detail about why Michigan was particularly susceptible to such a disparity between the public polling and the outcome:

He and Shor weren’t without sympathy for the pollsters in this case. Michigan, Shor explains, is one of the hardest states for any researcher to survey. For pollsters in an election season, it’s like the moment in the stress test that causes the already-ailing patient to collapse on the treadmill.

First of all, pollsters in Michigan have to contend with the same methodological problems that have turned polling into such a crapshoot nationwide. The classic pollster’s technique known as random digit dialing, in which firms robo-dial phone after phone, is failing, because an ever-dwindling number of people have landlines. By 2014, 60 percent of Americans used cell phones either most or all of the time, making it difficult or impossible for polling firms to reach three out of five Americans. (Government regulations make it prohibitively expensive for pollsters to call cell phones.) And even when you can dial people at home, they don’t answer; whereas a survey in the 1970s or 1980s might have achieved a 70 percent response rate, by 2012 that number had fallen to 5.5 percent, and in 2016 it’s headed toward an infinitesimal 0.9 percent. And finally, the demographics of participants are narrowing: An elderly white woman is 21 times more likely to answer a phone poll than a young Hispanic male. So polling samples are often inherently misrepresentative.

In Michigan, all these systemic problems are compounded by a uniquely dire local crisis of data collection. The state’s official list of registered voters—known in industry parlance as a voter file, typically a roster of names, addresses, and voting histories—is a mess. The economic collapse has driven many Michiganders to change addresses and phone numbers, a churn that disproportionately affects black voters. That made the polls for the contest between Sanders and Clinton particularly susceptible to atrocious sampling error.
“A lot of the polling was showing Sanders doing unrealistically badly with African Americans,” Shor says. Wagner and Shor knew all this about Michigan because that’s their business—they are two of the most revered numbers guys in American politics—but also from hard-won firsthand experience. Four years ago, when they both worked for President Obama’s reelection campaign, they helped narrowly avoid an expensive debacle in the Great Lakes State by convincing their team to completely ignore the public polls.
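
To make the sampling problem concrete, here is a minimal Python sketch of how unequal response rates skew a raw poll, and how reweighting each group to its share of the population (post-stratification) pulls the estimate back. The two groups, their support levels, and the response rates are invented for illustration; only the 21-to-1 response ratio echoes the figure quoted above. This is not a description of how Civis or any particular pollster actually weights its data.

```python
def raw_and_weighted_support(groups):
    """Return (raw, weighted) expected support for a candidate.

    Each group is a dict with hypothetical fields:
      population_share: fraction of the electorate in the group
      response_rate:    chance a member of the group answers the poll
      support:          fraction of the group backing the candidate
    """
    # Expected respondents per group are proportional to
    # population share times response rate.
    reach = [g["population_share"] * g["response_rate"] for g in groups]
    total_reach = sum(reach)
    sample_share = [r / total_reach for r in reach]

    # Raw estimate: count respondents as they come in.
    raw = sum(s * g["support"] for s, g in zip(sample_share, groups))

    # Post-stratified estimate: scale each group back to its population share.
    weights = [g["population_share"] / s for g, s in zip(groups, sample_share)]
    weighted = sum(
        s * w * g["support"] for s, w, g in zip(sample_share, weights, groups)
    )
    return raw, weighted


# Illustrative numbers only: one group answers 21 times as often as the other.
groups = [
    {"population_share": 0.5, "response_rate": 0.021, "support": 0.40},
    {"population_share": 0.5, "response_rate": 0.001, "support": 0.70},
]
raw, weighted = raw_and_weighted_support(groups)
print(f"raw: {raw:.3f}  weighted: {weighted:.3f}")
```

With these made-up inputs the raw poll understates the candidate's support by more than 13 points. Reweighting only helps if the few people who answer in the low-response group resemble the ones who don't, which is exactly what breaks down when the voter file behind the sample is unreliable.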

Michigan was thought to be a safe state for Obama, but in June 2012 the public polls started showing a narrowing lead, and the campaign responded by planning to spend $20 million here to protect it. The Romney campaign had already shifted resources into the state, thinking it really could win it. But Wagner and Shor believed those polls were wrong, based on their deep analysis of far more data points and on internal polling. They made a presentation to the campaign bosses and convinced them not to waste money and resources there. Obama won the state easily, proving them right.

The Romney campaign, looking only at the public polling data and its own internal polling, was convinced that it would win the 2012 election. Wagner and Shor predicted the outcome in every single state and the resulting 126 electoral vote margin of victory. A couple of other data analytics firms that grew out of the machine the Obama campaign built are working for Clinton now, and have been for more than a year. Trump, on the other hand, has done almost nothing. His campaign is still buying email lists because it failed to build its own during the primaries. That's how far behind they are on this.

Now it should be noted here that high-level polling analysis, as opposed to broader data analysis, still does pretty well in making predictions. Nate Silver has been able to predict the last two election results with great accuracy, and even in this year’s primaries his models went 53 for 58. That’s pretty damn good. But his models do that by filtering out the statistical noise: they average the polls and weight each one differently based on its pollster's historical accuracy.
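
As a rough illustration of that idea, here is a minimal Python sketch of an accuracy-weighted polling average. It is not FiveThirtyEight's actual model, which is far more elaborate. The weights are hypothetical stand-ins for a pollster's track record. The first two margins echo the two Iowa polls mentioned earlier (Clinton +29 and Sanders +8), and the third poll is invented.

```python
def weighted_poll_average(polls):
    """Return the weighted mean margin.

    polls is a list of (margin, weight) pairs, where margin is the
    leader's advantage in points and weight is a hypothetical score
    for how accurate that pollster has been in the past.
    """
    total_weight = sum(weight for _, weight in polls)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(margin * weight for margin, weight in polls) / total_weight


# (Clinton margin in points, hypothetical accuracy weight)
polls = [
    (29.0, 0.5),  # outlier from a pollster with a weak record (assumed)
    (-8.0, 1.0),  # Sanders ahead by eight
    (5.0, 2.0),   # invented poll from a pollster with a strong record
]

simple_average = sum(margin for margin, _ in polls) / len(polls)
weighted_average = weighted_poll_average(polls)
print(f"simple: {simple_average:+.1f}  weighted: {weighted_average:+.1f}")
```

In this toy case the simple average is pulled toward the 29-point outlier, while the weighted average sits closer to the poll with the best assumed track record. Averaging dampens the noise from any one survey, but it cannot fix an error that all the polls share, which is what happened in Michigan.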

So what should we be looking for in predicting the horse race? A few things, based primarily on the fact that the Clinton campaign has far more accurate tools than either the Trump campaign or the public polls:

The race will be hard to follow, given the poor quality of political polls. “Chill,” advises Dan Wagner, CEO of Civis Analytics. “Stay away from the day-to-day polls.” Candidates have access to better data than the rest of us, so watch where their campaigns are adding staff; those are the states that will be the most competitive. Here are some other telling signals. —G.M.G.

High Hispanic Turnout

Against Donald Trump, that is. Minorities are a growing part of the electorate—now as much as 30 percent. If Hispanic voters turn out in droves to oppose Trump’s immigration rhetoric, Clinton will be able to pick off traditional battleground states like Nevada and Florida. If her campaign expands into red states like Arizona and Georgia, she’s angling for a landslide.

Clinton Playing Defense

Democrats have an electoral base of about 242 of the needed 270 votes—they can lose Ohio and Florida and still win the presidency. On the other hand, Trump, whose voters are overwhelmingly white, must win in traditionally Democratic rust-belt states like Pennsylvania, Michigan, and Wisconsin. If Clinton starts moving to defend those states, expect a closer race.

Republican Defectors

Conservative suburban communities packed with upper-income working women and business types who find Trump’s antitrade stance abhorrent might deliver states like Virginia and New Hampshire to Clinton. In that case, it might also mean the Senate will end up going Democratic. If so, get ready for a slightly more expedited Supreme Court confirmation process.

Also bear in mind that Clinton starts with a considerably higher number of safe states and safe electoral votes, about 230 of the 270 needed to win.