By far the most common question I get this time of year is some variant of “who is going to win the election” (or “is [enter your party here] going to do as well as people say?”, etc). This year, I really don’t have a clear answer, at least in the Senate.
That’s because the polls today are just really, really close. That’s true if you look at the polling averages at FiveThirtyEight.com — where 8 Senate races are within the normal range of polling error — or the model run by the Economist data team, where 6 races are within 4 points.
That means that this year, in stark contrast with the last high-profile forecast I made, my answer is really only slightly bluer than a shrug. Control of the Senate probably favors Democrats today but could very readily go either way. The polls in key Senate races are close enough that partisan nonresponse or errors in likely voter models could easily throw off our measurement of public opinion and give us a consequentially distorted view of the electorate’s likely behavior. The Republican Party, in other words, is less than a normal polling error away from controlling the Senate. (The same thing goes for Democrats in the House.)
But what do we mean by a “normal polling error”? Sure, statisticians and election forecasters might be able to intuit the median of an expected distribution of error… but that is gobbledygook for most people.
What I find much more helpful is presenting different forecasts of elections conditional on some amount of polling bias in each state. This is what modelers tend to call “conditional forecasting,” where they change a parameter in a model and see how that changes things downstream. Assume, say, we are modeling the spread of covid-19 in the US in April 2020. We could raise or lower the rate at which we assume the virus spreads and present the different scenarios to people interested in the forecast. We could say that if the virus is more contagious than we thought, then X people will probably get it, an increase of Z over the Y-people-infected estimate from our first model.
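To make the X, Y and Z in that example concrete, here is a toy sketch. The function name `projected_infections`, the starting case count and both growth rates are made up for illustration, and simple compounding growth is an assumption, not a real epidemiological model.

```python
def projected_infections(current_cases, daily_growth_rate, days):
    """Cases after `days` of compounding growth at an assumed daily rate."""
    return current_cases * (1 + daily_growth_rate) ** days


# Hypothetical inputs, chosen only to illustrate the idea.
CURRENT_CASES = 10_000
DAYS_AHEAD = 14

baseline = projected_infections(CURRENT_CASES, 0.05, DAYS_AHEAD)  # "Y": first model
scenario = projected_infections(CURRENT_CASES, 0.08, DAYS_AHEAD)  # "X": more contagious
increase = scenario - baseline                                    # "Z": the difference

if __name__ == "__main__":
    print(f"Baseline (Y): {baseline:,.0f}")
    print(f"More contagious (X): {scenario:,.0f}")
    print(f"Increase (Z): {increase:,.0f}")
```

The model itself never changes between the two runs. Only the assumed parameter does, which is the whole point of a conditional forecast.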
Similarly, we can look at changes in election probabilities (and, in this case, expected Senate seats) if we change the polling bias parameter of our model. I find this is intuitive for a lot of people: “If we assume the polls are as biased as they were last time, we think X will happen in the Senate. If we assume they are not biased on average, then our forecast is Y.” Conditional forecasting may be the next step in how we communicate election forecasts and polling aggregation to the public.
But before we get there, let’s just look at the results for 2022. Here’s what I’ve done:
1. Take the FiveThirtyEight polling average in every Senate race as of October 22, 2022
2. Take my historical database of polling averages and calculate the expected error in the average Senate race at today’s time horizon before polling day (18 days out)
3. Also measure the bias in polls in 2016, 2018 and 2020
4. Calculate the share of Senate polling errors since 2000 that is shared between states nationally (this is necessary for simulating correlated polling error across states)
5. Simulate the election once leaving 538’s polling averages today as-is
6. Simulate the election again adjusting 538’s polling averages for the average polling bias in each state since 2016
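For readers who want to see the mechanics, the steps above can be sketched roughly as follows. This is a minimal illustration, not the code behind the results in this post. The state names, margins, bias values, safe-seat count, error size (`total_sd`) and nationally shared share of error (`shared_share`) are all hypothetical placeholders, and normally distributed errors are an assumption.

```python
import random

# Hypothetical inputs for illustration only; these are NOT real polling averages.
# Margins are the Democratic lead in points. Bias is how far polls are assumed
# to have overstated Democrats, on average, in recent cycles.
POLL_MARGINS = {"State A": 3.0, "State B": 1.5, "State C": 0.5,
                "State D": -1.0, "State E": -2.5}
ASSUMED_BIAS = {"State A": 2.0, "State B": 1.0, "State C": 3.0,
                "State D": 2.5, "State E": 1.5}

SAFE_DEM_SEATS = 47     # hypothetical: Democratic seats assumed not in play
SEATS_FOR_CONTROL = 50  # 50 seats plus the vice president's tie-breaking vote


def simulate_seats(margins, bias=None, n_sims=5000, total_sd=5.0,
                   shared_share=0.5, seed=2022):
    """Return one simulated Democratic seat count per simulation."""
    rng = random.Random(seed)
    bias = bias or {}
    # Split the total error variance into a national part shared by every
    # state and an independent state-level part.
    shared_sd = total_sd * shared_share ** 0.5
    state_sd = total_sd * (1 - shared_share) ** 0.5
    seat_counts = []
    for _ in range(n_sims):
        national_error = rng.gauss(0, shared_sd)  # same draw for all states
        seats = SAFE_DEM_SEATS
        for state, margin in margins.items():
            # Shift the polling average by the assumed bias (zero if none given).
            expected = margin - bias.get(state, 0.0)
            simulated = expected + national_error + rng.gauss(0, state_sd)
            if simulated > 0:
                seats += 1
        seat_counts.append(seats)
    return seat_counts


def control_probability(seat_counts, threshold=SEATS_FOR_CONTROL):
    """Share of simulations in which Democrats reach the control threshold."""
    return sum(s >= threshold for s in seat_counts) / len(seat_counts)


if __name__ == "__main__":
    as_is = simulate_seats(POLL_MARGINS)                        # polls taken at face value
    adjusted = simulate_seats(POLL_MARGINS, bias=ASSUMED_BIAS)  # polls shifted by past bias
    print(f"P(Dem control), polls as-is:       {control_probability(as_is):.2f}")
    print(f"P(Dem control), adjusted for bias: {control_probability(adjusted):.2f}")
```

The shared national error is what makes the states move together. Setting `shared_share` to 0 would treat every state's polling miss as independent, which tends to narrow the range of simulated seat counts.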
And I get these results:
This graph shows that if you think the pollsters have fixed their errors since 2020 (or that the polls will otherwise be unbiased on average), then the Democrats’ probability of controlling the Senate after November is about 64%: a tossup race, leaning their way.
If, on the other hand, you think the polls will be about as biased as they were in each state from 2016 to 2020, then control of the chamber is currently leaning towards Republicans.
Clear enough? It’s a close race. If polls are biased about as much as they have been recently — and I’m inclined to think that many of them are — then Republicans are probably poised to gain about a seat.
Of course, the polls could be even worse than the bias we are currently conditioning on! That’s why we simulate different errors around the mean and why the chart above shows a distribution of different outcomes. In some of them, Republicans win 55 Senate seats! Though I think this is pretty unlikely.
One important inference? Whether you think Democrats have any significant shot at winning 52 Senate seats comes down to whether you also believe polls will be unbiased predictors of the outcome on average. If you don’t, there is a slim (not “no”) chance that happens.
Is this clearer than a probability on a forecast? What do you think? Share your comments below. Happy weekend.
I really like your conditional approach to polling forecasts. I find that the conventional approach has an air of fake precision. The conditional approach is much clearer.
I have seen a trend in the polling data (maybe it's my imagination) ...Biden's poll numbers are slightly better when likely voters are sampled...Many polls on his approval have been based on A or RV voters...It's probably nothing...
It seems clear that the GOP has positioned its minions to prevent people from going to the polls by making the polling places hard to reach and few and far between; to harass and intimidate voters at the polls, to claim victory or fraud prior to final vote count, to have secretaries of state stop the vote count, to file more litigation demanding that the governor alone has the authority to name state winners, to interfere with vote counting, etc. You can't poll on the probabilities of those outcomes. As Stalin didn't say, it doesn't matter who votes, it matters who counts the votes. Or says s/he counts the votes.
I conditionally approve of this approach!
There appear to be many new voters, especially young and women. Are they represented in these polls?
I'm thinking about what happened in Kansas.