Paul Alper pointed me to an explanation by Nate Silver of his election forecasting methodology, where Nate writes:

I don’t like to call out other forecasters by name unless I have something positive to say about them — and we think most of the other models out there are pretty great. But one is in so much perceived disagreement with FiveThirtyEight’s that it requires some attention. That’s the model put together by Sam Wang, an associate professor of molecular biology at Princeton. That model is wrong — not necessarily because it shows Democrats ahead (ours barely shows any Republican advantage), but because it substantially underestimates the uncertainty associated with polling averages and thereby overestimates the win probabilities for candidates with small leads in the polls. This is because instead of estimating the uncertainty empirically — that is, by looking at how accurate polls or polling averages have been in the past — Wang makes several assumptions about how polls behave that don’t check out against the data.

Alper asks what I think of all this. I basically agree with Nate—if Sam Wang really gave Sharron Angle a 99.997% chance of being elected to the Senate in 2010 then, yeah, that’s pretty good evidence that Sam was overconfident.

I actually discussed some of this in a post back in 2010 with the horribly academic title, “Some thoughts on election forecasting,” in which I questioned Sam’s naive statement, “Pollsters sample voters with no average bias. Their errors are small enough that in large numbers, their accuracy approaches perfect sampling of real voting.” As I wrote at the time:

This is a bit simplistic, no? Nonresponse rates are huge and pollsters make all sorts of adjustments. In non-volatile settings such as a national general election, they can do a pretty good job with all these adjustments, but it’s hardly a simple case of unbiased sampling.

So, given all this, it makes sense to me that Sam’s probabilistic forecasts would be over-certain.
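To see how an assumed polling error drives these win probabilities, here is a minimal sketch under the usual normal approximation: if the final vote margin is modeled as normally distributed around the polling lead, the win probability is just a normal tail probability, and it is extremely sensitive to the assumed standard error. The numbers below are illustrative, not the actual inputs of either Sam’s or Nate’s model.

```python
from math import erf, sqrt

def win_prob(lead, sigma):
    """P(true margin > 0) when the final margin is modeled as
    Normal(lead, sigma^2). This is the standard normal CDF at lead/sigma."""
    return 0.5 * (1 + erf(lead / (sigma * sqrt(2))))

# Illustrative numbers: a 2-point polling lead.
# A tiny assumed error yields near-certainty, roughly the 99.997%
# flavor of overconfidence discussed above:
print(win_prob(2.0, 0.5))  # about 0.99997
# A larger, more empirically grounded error gives a far more modest answer:
print(win_prob(2.0, 3.0))  # about 0.75
```

The point is that the same 2-point lead can mean "near-lock" or "modest favorite" depending entirely on the assumed uncertainty, which is why estimating that uncertainty empirically matters so much.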

In addition, I disagreed with Sam’s opposition to “fancy modeling” and “assumption-laden models.” OK, “fancy” sounds bad—it sounds like the kind of thing that Marie Antoinette might do, right? But as I wrote back in 2010:

Assumptions are a bad thing, right? Well, no, I don’t think so. Bad assumptions are a bad thing. Good assumptions are just fine. Similarly for fancy modeling. I don’t see why a model should get credit for not including a factor that might be important.

And I followed up with my favorite Radford Neal quote.

So, yeah, I’m with Nate on this one. His model is complicated because life is complicated, and he’s trying to do the best he can.

That said, I have some sympathy for Sam, who’s a biologist who does election forecasting in his spare time. And it’s interesting to see that a very simple model, set up by a non-expert as a little side project, can come within shouting distance of something that took much more effort and has much more sophistication.

Nate’s model is not perfect either (in particular, last time I looked, the geographic correlations in the uncertainties didn’t seem quite right) but as Al Smith might have said had he been a statistician, the solution to the problems of statistical modeling is more modeling (if it seems worth the effort).

In short, I’m going with Nate (although maybe not to so many significant digits), but I think Sam is doing a useful service by providing a sort of baseline.

Let me conclude by emphasizing, as Nate himself says, that Nate’s forecast is only one of many similar efforts. As our own John Sides explains, we’re all using the same information, one way or another, to forecast. As a statistician, my intention here is to express agreement with John and Nate that, in election forecasting, we want to use our substantive understanding to construct a reasonable model, rather than to attempt some sort of mechanical procedure based on the polls alone.