Happy Saturday, subscribers!
The election for Virginia’s next governor is on Tuesday, and I genuinely have no idea how it’s going to turn out. Gun to my head I’d say Terry McAuliffe, the Democratic candidate, is the slight favorite. But his Republican opponent Glenn Youngkin has closed the gap in the last couple of weeks, and the polling averages are as close to 50-50 as they come. It’s a genuine nail-biter in the Old Dominion.
But I don’t want to write a state-of-the-race post. Instead, I want to address some of the methodological artifacts that are causing polls to diverge. We have some surveys showing Youngkin up 8 and some McAuliffe up 4. If polls were truly normally distributed around a mean according to their margin of error—say, McAuliffe +0.5 or some such—then we shouldn’t see such a wide variance from sampling error alone. That there’s such a wide range means something else is pushing the polls around. And that’s where particular choices for weighting and question wording come in.
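To put a rough number on that intuition, here’s a quick back-of-envelope calculation. The sample size and the “true” margin are assumptions for illustration, not figures from any actual poll:

```python
import math

# Hypothetical setup: true two-party margin of McAuliffe +0.5 points,
# and a typical statewide sample of n = 800 likely voters.
true_margin = 0.5   # points, McAuliffe ahead
n = 800

p = (true_margin / 100 + 1) / 2          # true McAuliffe share, ~0.5025
se_share = math.sqrt(p * (1 - p) / n)    # standard error of the vote share
se_margin = 2 * se_share * 100           # SE of the margin ~ 2x SE of share, in pts

# How surprising is a Youngkin +8 poll (margin = -8) under pure sampling error?
z = (true_margin - (-8)) / se_margin
tail_prob = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided normal tail

print(f"SE of the margin: {se_margin:.1f} pts")
print(f"Youngkin +8 is {z:.1f} standard errors out (p = {tail_prob:.3f})")
```

Under these assumptions a Youngkin +8 result sits more than two standard errors from the mean—possible, but rare enough that seeing it alongside McAuliffe +4 polls points to something beyond sampling noise.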
That something else is methodology. Primarily, we are observing huge differences between polls of registered voters (RV) and likely voters (LV). That means the estimate a given poll produces will be influenced heavily by how it identifies people who are likely to cast ballots, and whether the assumptions underpinning that process are correct. For example, over the last 5 polls that have produced a horse-race number among both LV and RV samples, the LV filter has produced, on average, a 6-point swing toward Youngkin. But the precise effect ranged anywhere from 3 to 10 points.
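As a toy illustration of how that swing is computed—the margin pairs below are invented for the example, not the actual five polls:

```python
# Hypothetical (RV margin, LV margin) pairs, in points toward McAuliffe;
# made-up numbers for illustration, not the actual five polls.
polls = [(7, 1), (5, 2), (9, -1), (4, 1), (6, 0)]

# Swing toward Youngkin = RV margin minus LV margin
swings = [rv - lv for rv, lv in polls]
avg_swing = sum(swings) / len(swings)

print("swings toward Youngkin:", swings)   # varies poll to poll
print("average swing:", avg_swing)
```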
The sheer range of estimates, combined with the fact that the race is much more favorable toward Republicans than one would expect based on the 2020 results (Biden +10), has produced a vocal community of poll-unskewers online. People were quick to note that a recent Fox News poll produced a likely voter electorate whose recalled 2020 vote split was about 10 points more favorable to Donald Trump than the actual result. But this is the wrong target! The likely voter electorate could vary from 2020 for all sorts of reasons. Republicans could be more energized to vote, for example. So the real target for assessing how close the polls are coming to the population of Virginia voters is the recalled vote among registered voters. That way, you’re comparing samples that are closer to each other. And in the Fox poll, recalled vote in the RV sample was around Biden +8, just 2 points off from the actual results of the election. So the LV-unskewers are kinda missing the mark here.
But the broader point here is that when methodological decisions are introducing this much extra variance into polling averages, it is a good signal the poll averages could have more error than usual. As a final example, take the most recent poll from Monmouth University. The pollsters used three different methods to produce a sample of likely voters—each time including voters with lower probabilistic scores for turnout, thereby generating estimates for a high-, medium-, and low-turnout electorate—and ended up with toplines ranging from McAuliffe +3 to Youngkin +3.
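To see how a turnout cutoff alone can move a topline that much, here’s a small simulation. Everything in it is assumed: the electorate, the turnout scores, and especially the linear relationship between turnout propensity and vote choice—it is a sketch of the general technique, not Monmouth’s actual model:

```python
import random

random.seed(1)

# Hypothetical electorate: each voter gets a turnout score in [0, 1] and a
# candidate preference. The invented relationship makes lower-propensity
# voters lean slightly more Democratic, the pattern the Monmouth toplines
# would suggest.
voters = []
for _ in range(20_000):
    score = random.random()
    p_dem = 0.575 - 0.10 * score  # assumed relationship, illustration only
    voters.append((score, random.random() < p_dem))

def margin(cutoff):
    """Dem margin (pts) among voters whose turnout score clears the cutoff."""
    sample = [is_dem for score, is_dem in voters if score >= cutoff]
    dem_share = sum(sample) / len(sample)
    return (2 * dem_share - 1) * 100

# A lower cutoff admits more marginal voters, i.e. a higher-turnout model.
for label, cutoff in [("high-turnout", 0.2), ("medium", 0.5), ("low-turnout", 0.8)]:
    print(f"{label} electorate (cutoff {cutoff}): margin {margin(cutoff):+.1f}")
```

With these made-up parameters, loosening the screen shifts the topline several points toward the Democrat, mirroring the McAuliffe +3 to Youngkin +3 spread Monmouth reported across its three electorates.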
That a single poll can be pulled around so much by the decisions a pollster makes when weighting their data should caution you against putting too much weight on the aggregates—or, put another way, against assuming that errors will be small. You should not be surprised if polls are off by 6 or 7 points next Tuesday. Such a range doesn’t make polls useless, of course, but it does mean that in a close race they aren’t going to tell us much other than the fact that it’s a close race.
But maybe that’s more important than picking the right winner with laser-like precision, anyway.
I really appreciate the trend of showing different poll results based on different turnout/electorate models. It’s much more realistic and diminishes the possibility of an overly simplistic takeaway from the poll.