The secret truths of election forecasting

- September 22, 2014

In case you missed it, election forecasting can be fractious. There is a “war” among various forecasting models of the Senate elections, we are told. I’m here to tell you: no, there’s not. Or at least, much less of a war than a grabby headline would suggest. And so, in the spirit of Kumbaya, here are the four secret truths of election forecasting.
1. Everybody is looking at the same information.
There are two types of ingredients in forecasting models. The first is “fundamentals,” or structural factors that are known to influence congressional elections: measures of the national political climate, the partisan leaning of districts or states, candidate fundraising, etc. The models that take into account fundamentals (ours at Election Lab, The Upshot’s Leo and 538’s) do not draw on an identical set of factors, but the factors in each model are very similar nonetheless.
The second ingredient is polls. All of the models use polling data, and three models — Pollster’s, Sam Wang’s and Drew Linzer’s at Daily Kos — rely exclusively on polls. So do other election handicappers. As Stuart Rothenberg has written (gated):

In fact, during the final six weeks or so of an election, my assessments of races are based almost entirely on state-level and district-level survey data… (his emphasis)

There are, as always, private polls conducted by candidates or parties and shared with a select few. And there are, as always, strongly expressed views that these polls are superior to the publicly released polls. For example, here are Chuck Todd, Mark Murray and Carrie Dann:

That said, this year should be a reminder as to why to be leery of any political handicapping site ONLY using released polls as their basis for prediction. These aggregation and regression analysis sites are trying to find accuracy with incredibly inaccurate data. That’s not exactly scientific.

If this is true, privately conducted polls should be systematically closer to the eventual election outcome than public polls. But, to my knowledge, we have no evidence for this idea, just assertions — such as Todd, Murray and Dann’s claim that public polls are “incredibly inaccurate.” Indeed, in the piece noted earlier, Rothenberg — who is privy to private polls — expresses some skepticism that they are systematically better than public polls:

Over the years, those data, which were provided to me by party operatives with the caveat that they could not be made public, have been very reliable. Recently, however, I have been growing less comfortable with the polling.

So, for the time being, it is not clear that private polls provide significantly different information than the public polls. Thus, it’s reasonable to say that everyone is looking at the same basic information.
2. The models are very similar.
The models that take into account fundamentals all do the same basic thing: examine the historical relationship between these factors and previous elections, and then project an Election Day outcome based on where those factors stand right now.
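To make that "fit the past, project the present" step concrete, here is a minimal sketch in Python. It is not any forecaster's actual model: it uses a single hypothetical factor (a state's partisan lean), invented numbers, and a plain least-squares line, whereas the real models draw on several factors at once.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Invented past races (NOT real data): partisan lean of the state, in points,
# and the vote margin the party's candidate ended up with, in points.
past_lean = [-10.0, -5.0, 0.0, 5.0, 10.0]
past_margin = [-12.0, -4.0, 1.0, 3.0, 12.0]

# Step 1: estimate the historical relationship.
intercept, slope = fit_line(past_lean, past_margin)

def project_margin(lean_today):
    """Step 2: project an outcome from where the factor stands right now."""
    return intercept + slope * lean_today

print(round(project_margin(2.0), 2))
```

The function names and the choice of partisan lean as the lone predictor are assumptions made for illustration only.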
In addition, all the forecasters have some kind of methodology for averaging polls, which is then either combined with the fundamentals or serves as a stand-alone forecast. There are subtle differences in how everyone computes these averages — see, for example, Nate Silver, Natalie Jackson of Pollster, Sam Wang and The Upshot. But it is not clear that these differences lead to wildly divergent poll averages. Nate Silver notes, and I would agree, that taking account of an estimate of “pollster quality” — which 538 does and we do not — has a small effect.
Moreover, even the models that combine fundamentals and polls are giving significant weight to the polls. This was true, for example, of The Upshot’s model by late summer. It is true of our model as well.
For this reason, I am skeptical of the hypothesis that any differences among the forecasts are due to whether the underlying model takes into account fundamentals or relies solely on polls. Speaking for the model at Election Lab, I can say that in competitive Senate races with a substantial number of polls, it is the polling average that drives the forecast and any changes in that forecast. Nate Silver has made a similar point as well.
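A rough sketch shows why polls end up driving a combined forecast once a race has been polled often. The weighting rule below is a hypothetical one chosen for simplicity (the weight on polls is n / (n + prior_strength), where n is the number of polls); it is not the rule used by Election Lab or any other model, and the margins are invented.

```python
def poll_average(margins):
    """Simple unweighted average of poll margins, in points."""
    return sum(margins) / len(margins)

def blended_forecast(fundamentals_margin, poll_margins, prior_strength=2.0):
    """Blend a fundamentals projection with a polling average.

    Hypothetical scheme: the weight on polls grows with the number of
    polls, so a heavily polled race is driven almost entirely by polls.
    """
    n = len(poll_margins)
    if n == 0:
        # No polls yet: the fundamentals are all we have.
        return fundamentals_margin
    w = n / (n + prior_strength)
    return w * poll_average(poll_margins) + (1 - w) * fundamentals_margin

# Invented example: fundamentals say +4, eight polls all show a tied race.
print(blended_forecast(4.0, []))         # fundamentals only
print(blended_forecast(4.0, [0.0] * 8))  # mostly polls
```

With eight polls the polling average carries 80 percent of the weight in this sketch, so two models that differ only in their fundamentals would still land close together.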
3. Most everyone has a similar forecast.
Here’s the graph from Vox:
Matt Yglesias noted that the Senate forecasts are “actually pretty similar,” and he is correct.
Although Sam Wang’s forecast is more optimistic for the Democrats, most models are producing nearly identical forecasts — overall, and in individual races.
To be sure, as Mark Blumenthal and the Pollster team note, there are disagreements about how confident we can be in particular forecasts. This is the crux of the dispute between Nate Silver and Sam Wang, and it is a perennial challenge for forecasters, as Ben Lauderdale and Drew Linzer describe here. We, too, have spent some time trying to “get the uncertainty right.”
No doubt we all will continue to learn more about how to estimate the uncertainty underlying election forecasts. But I don’t want this to obscure the fact that most of the models are coalescing around a similar answer — as they should, given the similarities in the data we’re all using and the models we’ve built.
4. The election won’t tell us whose model is “correct.”
It would be nice if the election could crown the “best forecaster.” But, echoing Hans Noel, I would strongly urge against the tendency to do this. Here is Noel:

When the forecast calls for a 60% chance of rain and it doesn’t rain, you don’t conclude that the meteorologist was wrong. You conclude that it might have rained, but it didn’t.

Of course, I will be pleased if our forecasts are correct — especially in races like North Carolina, where early predictions based on the underlying fundamentals were somewhat controversial. And some models might end up performing better in this particular election. But evaluating forecasting models will require many years of elections, not just November’s.
If Election Lab calls all 36 races correctly and no one else does, we won’t be dancing in the end zone. In reality, a number of smart people are building plausible models of Senate elections and, in our case, House elections, too. I’m glad that so many are thinking carefully about the challenges inherent in forecasting.
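The rain analogy can be put in numbers. One common way to score probabilistic forecasts is the Brier score, the average squared gap between the stated probability and what happened (lower is better). The sketch below is an illustration with invented outcomes, not an evaluation of any actual forecaster, and the Brier score is my choice of yardstick rather than one named above.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# One event: a 60% forecast and no rain. Judged alone, it looks "wrong".
single = brier_score([0.6], [0])

# Ten events (invented): the same 60% forecast, and it rains 6 times out of 10.
outcomes = [1] * 6 + [0] * 4
calibrated = brier_score([0.6] * 10, outcomes)

# A forecaster who always says 50% scores slightly worse on the same events.
coin_flip = brier_score([0.5] * 10, outcomes)

print(round(single, 2), round(calibrated, 2), round(coin_flip, 2))
```

A single outcome says little about whether "60 percent" was a good forecast, and the gap between forecasters only becomes visible, slowly, as events accumulate.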
These four points are probably self-evident to any close observer. In which case, I lied about “secret.” (Apparently I can’t resist a grabby headline either.)
The much less grabby, but ultimately correct, story is this: the “war of the forecasting models” has quickly become “the unexciting consensus among the forecasting models.”