Friends,
The other day, someone sent me this Twitter thread from Dave Wasserman, an elections handicapper for the Cook Political Report. In it, Dave explained why he thinks the presidential race is closer to a 60-75% chance of a Biden win than a 90% chance. While I disagree with him, I thought he made some good points in a respectful manner, so I fired back a thread in the spirit of learning. I want to expand on that thread in this post today.
As I blogged about last week, I think there is a lot of value in incorporating context and prior judgment (or what some people call "subjectivity") into statistical models. That's why Bayesian statistics — the branch of statistics concerned with expressing beliefs as probabilities and incorporating judgment into models — is so great!
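To make that concrete, here is a minimal sketch of the basic Bayesian move the post is describing: combine a subjective prior about the national margin with what the polls say, weighting each by how certain you are about it. This is my own toy illustration, not The Economist's model, and every number in it is made up.

```python
# Toy Bayesian update: blend a "fundamentals" prior on the national margin
# with a polling estimate, each weighted by its precision (1 / variance).
# All numbers are invented for illustration.
from statistics import NormalDist

def posterior(prior_mean, prior_sd, poll_mean, poll_sd):
    """Precision-weighted combination of two normal estimates."""
    w_prior = 1 / prior_sd**2
    w_poll = 1 / poll_sd**2
    mean = (w_prior * prior_mean + w_poll * poll_mean) / (w_prior + w_poll)
    sd = (w_prior + w_poll) ** -0.5
    return mean, sd

# A vague prior (margin of +5, plus or minus 6) meets a steadier poll
# average (+9, plus or minus 3); the posterior lands nearer the polls.
mean, sd = posterior(prior_mean=5.0, prior_sd=6.0, poll_mean=9.0, poll_sd=3.0)
win_prob = 1 - NormalDist(mean, sd).cdf(0)  # P(margin > 0)
```

Because the poll average is the more precise of the two inputs, it pulls the posterior most of the way toward itself; the prior still matters, just in proportion to how much confidence you placed in it.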
But I do think that Dave’s 60-75% for Biden is wrong. The statistical model I helped build for The Economist has Biden at 88%. That's a big difference! However, I think it's actually pretty easy to explain.
Dave and I probably start off ingesting the same sources of data, almost all of which are good news for Joe Biden. Donald Trump's approval rating, for example, is low and has been in the same range since roughly June of 2017. Another point is that incumbents typically get blamed for a bad economy — and all the data suggest we're living in a pretty crappy one right now. The pre-election polls have hovered between an eight- and ten-point margin for Biden for months, and polarization makes them more predictive and less variable. The early data on the DNC and RNC also suggest that they probably won't matter in the end either.
The two good pieces of data for Trump, by my accounting, are (a) that annual growth in real disposable income has been above average, thanks to Congress's CARES Act, and (b) that the electoral college is biased toward Trump, because the key swing states lean to the right of the national popular vote. By my measure, Trump could lose by two to three points nationally and still be favored to win the election.
But Dave and I start to diverge once we start creating probabilities out of these data. Dave looks at the numbers and perhaps thinks "Well, they are good for Biden, but I think they're only reasonably predictive, and chaotic things can happen, so I'm going to give them some but not a whole bunch of weight." He then balances the actual data about the race against his subjective judgments, part of which he has said includes that polls were wrong in 2016 and that Trump has a good stock of votes left among non-college whites in the Midwest that can help him make up ground. That's fine and a valid way to think about elections — but is it probabilistic?
No, I don't think it's the most probabilistic, or even the most predictive, way to model the 2020 election. That's primarily because quantitative handicapping has out-performed Dave for me before: my model called the House for Democrats in January of 2018, while Dave was still in tossup territory, and it has held up reasonably well in 2020 so far. But more to the point, it's also because I think a lot of pundits and analysts, Dave included, focus on the wrong thing when building their mental models of elections. Instead of focusing on the average expected outcome and simulating the election, they fixate on the possibility of error in our models. What if the economy ISN'T predictive? What if polls ARE wrong again? Etc.
Those are good questions to ask, but they can lead "qualitative" race-raters like Dave astray. They tend to focus on how wrong a model can be in extreme cases instead of considering how likely those cases are. These judgments can also get corrupted by psychological contaminants that degrade forecast accuracy, like recency bias (dwelling on how "wrong" polls were in 2016 rather than on their overall track record) and severe analytical paralysis (having too much data available to draw inferences from, which tends to push predictions toward 50-50).
So, this is where our model comes in. What we're doing with our statistical model is comparing today's data against a historical stock of data, asking how often candidates with numbers like the ones we see now have gone on to win the election, and quantifying the error term associated with those relationships. Instead of thinking "well, polls have been wrong before" or "well, there are lots of non-college whites in Wisconsin," our model tells us HOW wrong polls have tended to be and HOW LIKELY white working-class (WWC) voters are to change the trajectory of the race. Quantitative analysis... helps you quantify!
Now, to talk specifics. Maybe you have more information that you want to include in your model. We, for example, tend to focus on how entrenched polarization has made the electorate. As far as I can tell, Dave is ignoring that in his mental model of the race. That leads people to see huge movement in races such as 1976 and 1988 (which we will revisit later) and think "oh, polls can move around a lot, so Trump can still win the election with ease." That skews our judgment of the variance in the polls from the start, AND it ignores that movement in the polls has gotten smaller over time. That helps Biden's odds relative to Dave's 60% baseline.
Now, on the other hand, maybe you want to include some prior information about polls missing WWC voters, or about how Trump has some backlog of support from them (Dave has raised both possibilities). Cool! You can use that information statistically if you also have a sense of the uncertainty around it. For example, setting aside that I don't think this is the right call to make, if you wanted to say that Kenosha or the RNC makes Biden-voting non-college whites more likely to desert him, maybe by 3 points, +/- 5, you can put that information in your model. That's essentially what Dave's mental model is doing.
In this case, that's one of the reasons why Dave is closer to 67% Biden than 88%. But it's not correct to say that quantitative models aren't picking this up. By modeling the dynamics of poll movement over time, we know the magnitude with which a boost for a candidate among any number of permutations of groups can change the overall horse race. And it's closer to a small effect than a large one, especially in polarized elections. This is a key part of what makes "quantitative" forecasting powerful: if Dave had run a statistical analysis of the polls, instead of focusing on the fact that Trump could gain ground with whites, he would have known how likely that gain is to actually change the election (not very).
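Here is a sketch of how a hunch like that 3-points-plus-or-minus-5 shift can enter a model statistically: treat it as an uncertain shock layered on top of a baseline margin and simulate. The baseline numbers below are stand-ins I chose for illustration, not parameters from The Economist's actual model.

```python
# Monte Carlo sketch: fold a subjective prior (a shift of -3 points on the
# leader's margin, plus or minus 5) into a simulated race. Baseline values
# are illustrative stand-ins, not real model parameters.
import random

random.seed(1)

def simulate(n=100_000, base_margin=8.0, base_sd=4.0,
             shift_mean=-3.0, shift_sd=5.0):
    wins = 0
    for _ in range(n):
        shift = random.gauss(shift_mean, shift_sd)           # the prior hunch
        margin = random.gauss(base_margin, base_sd) + shift  # simulated race
        wins += margin > 0
    return wins / n

p = simulate()
```

Note what the uncertainty does: because the hypothesized backlash is itself uncertain, it widens the distribution and drags the win probability down somewhat, but it does not come close to making the race a tossup.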
Alright, that's the stuff I think we can learn the most from. But for the sake of posterity, I also think quite a lot of the data and thinking going into Dave's mental model sits closer to the TV-pundit end of the handicapping spectrum than to his usual high-quality work.
For example, Dave says that voters have short memories and so might stop punishing Trump for covid-19, and that our model doesn't account for that. But actually, by modeling the historical dynamics of pre-election polls, the model knows the average voter "memory" over the past 18 cycles. What we're talking about here is an autoregressive process: how well polling averages on one day predict the next. And we know that the day-to-day correlations in polls are quite high now, because of polarization, and getting higher.
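That persistence can be sketched as a simple AR(1) process. The closer the day-to-day correlation rho gets to 1, the less the polling average drifts from one day to the next. The parameters below are illustrative choices of mine, not values estimated from real polling data.

```python
# AR(1) sketch of a polling average: tomorrow's value is a pull toward the
# long-run mean plus a daily shock. Parameters are illustrative, not
# estimated from real polls.
import random

def ar1_path(days, mean, rho, shock_sd, start, seed=0):
    """Simulate a mean-reverting AR(1) path of a polling average."""
    rng = random.Random(seed)
    x = start
    path = [x]
    for _ in range(days):
        x = mean + rho * (x - mean) + rng.gauss(0, shock_sd)
        path.append(x)
    return path

# High day-to-day correlation (a polarized race) vs. a looser, noisier one.
# Same seed, so both paths see the same underlying shocks, just scaled.
sticky = ar1_path(days=70, mean=9.0, rho=0.99, shock_sd=0.3, start=9.0)
loose = ar1_path(days=70, mean=9.0, rho=0.80, shock_sd=2.0, start=9.0)
```

Plot the two and the point is visual: the polarized path hugs its mean while the loose one lurches around, which is exactly why high day-to-day correlation makes big late swings unlikely.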
Sure, Dave could make the case that voter memory is shorter now than it used to be and draw the inference that swings in the polls are likelier than they were in the 1970s. But this misses two broader points: (a) it's partisanship that is driving the majority of voter choice, and (b) polarization has made swings LESS likely, not more. So his point is actually just wrong here.
Dave also makes the case that Trump can influence the media in a way that damages voters' opinions of his opponent. But this inference is based on a sample size of n=1 (the previous election) and frankly probably isn't predictive this year. Joe Biden is no Hillary Clinton in terms of the backlog of political scandals that dogged her for her entire career, and the argument misses the point that Trump has been attacking Biden for months with zero change in the polls.
And finally, Dave makes three points: (a) that a five-percentage-point gap between the electoral college and the popular vote is "plausible," (b) that there's a 4%+ chance of a tie in the electoral college, and (c) that Trump could shave six percentage points off Biden's vote margin by way of trouble with mail-in ballots. But we have actually run the numbers on this and found that Dave is overstating the likelihood of all of these scenarios. See this and this and this.
So, to recap quickly:
I think that models that incorporate external information about the world are good, so long as people do it statistically. I like priors too! I just think that looking at data is a little more valuable than Dave does, and especially that formalizing the way we draw inferences via a statistical model could help him avoid falling victim to some of the classic probabilistic traps (like focusing too much on the 2016 polling error or Trump's supposed political prowess).
And here’s my footnote about 1988, which people keep bringing up.
Yes, Dukakis blew a big lead in the polls. But that lead was a mirage, inflated by one single poll released in the summer of 1988. While that poll had him up 17 points, the others were closer to 7. And we should have expected the race to tighten anyway because of the fundamentals.
Plus, as I keep saying, polarization makes swings of that magnitude less likely as more people have already made up their minds about the election than back then. According to our analysis, there are only half as many swing voters as there were back then.
You are brilliant and I hope you are right!