Online and live-caller polls disagree about Biden’s lead over Trump
The pattern muddies the waters for election forecasters in 2020
Hello loyal readers!
My computer is ticking away on some election models, so I thought I’d send you a little note while it finishes thinking.
I have spent a lot of time collecting state-level polling data for the 2020 presidential race over the past week or so. In all my rummaging through PDFs and news reports, I have noticed an interesting, if familiar, pattern: polls conducted online are showing very different numbers from polls conducted over the phone with a live interviewer.
Online polls have tended to be much more favorable to Joe Biden. In Florida, for example, the average of live-caller polls has Donald Trump up one point over (and inside the margin of error with) Biden, but polls conducted online have Biden up by 4. One firm even conducted two Florida polls a month apart, once each via phone and web, and found a 6-point difference in Biden’s projected margin of victory.
These differences sound small, but they are quite large when added up. State-level polls taken this early carry a fairly high amount of uncertainty (a standard deviation of roughly five percentage points over the last three cycles, per my estimates). Accounting for that, the online polls would imply a nearly 70% chance that Biden carries Florida. But the live-caller data would tip the scales the other way and imply a more modest 60% chance of victory for Trump.
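For readers who want to see how a polling margin turns into a win probability, here is a rough sketch. It assumes the final margin is normally distributed around the polled margin with the five-point standard deviation mentioned above. That plain normal approximation is an assumption made for illustration, and the function name is hypothetical. The figures quoted in the text come from my own estimates, so this back-of-the-envelope version need not match them exactly.

```python
from statistics import NormalDist

def win_probability(margin_pts, sd_pts=5.0):
    """Probability that the final margin ends up above zero, assuming it is
    normally distributed around the polled margin (an illustrative assumption)."""
    return 1.0 - NormalDist(mu=margin_pts, sigma=sd_pts).cdf(0.0)

# Florida margins quoted above, expressed as Biden minus Trump.
biden_online = win_probability(4.0)   # online polls: Biden +4
biden_live = win_probability(-1.0)    # live-caller polls: Trump +1

print(f"Online polls:      Biden win probability {biden_online:.2f}")
print(f"Live-caller polls: Trump win probability {1 - biden_live:.2f}")
```

The point of the exercise is the direction of the swing: a five-point change in the polled margin is enough to move the favorite from one candidate to the other.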
Worse, this pattern is common enough across states that a predictive model trained only on live-caller polls would give Biden a 60-70% chance of winning the Electoral College, whereas one trained only on online polls would give him a much better chance: closer to 80-85%, if not higher.
So, what gives? Who should we trust?
In theory, my prior is that live-caller polls of states are probably closer to the truth than the online polls. For one, they’re closer to the result I’m expecting. For another, the firms that run live-phone polls typically have longer and better track records (as well as methodologies that are generally easier for me to find and scrutinize online).
But the answer is probably not this clear-cut in practice. All my methodology-reading has revealed that the online pollsters are just as likely as the traditional firms, if not more likely, to use sound statistical methods (weighting for educational attainment, for example). We should expect this, of course: polls fielded online require careful adjustments to match the voting population, typically more than a randomly selected sample from telephone registries and the like does. It is reassuring nevertheless. Online polls also tend to have larger sample sizes, which in theory makes their estimates more stable and statistically robust (though I’m not sure this is true in reality).
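To make the education-weighting point concrete, here is a toy example of what reweighting does to a topline number. Every figure in it is made up for illustration (the sample composition, the electorate composition, and the support levels), and the helper name is hypothetical. It is a sketch of simple post-stratification on one variable, not any particular pollster's procedure.

```python
# Hypothetical poll: share of respondents in each education group.
sample_share = {"college": 0.60, "non_college": 0.40}

# Hypothetical Biden support within each group.
biden_support = {"college": 0.55, "non_college": 0.45}

# Hypothetical composition of the actual electorate (the weighting target).
population_share = {"college": 0.40, "non_college": 0.60}

def topline(shares, support):
    """Overall support as a weighted average of group-level support."""
    return sum(shares[group] * support[group] for group in shares)

unweighted = topline(sample_share, biden_support)    # uses the raw sample mix
weighted = topline(population_share, biden_support)  # uses the electorate mix

print(f"Unweighted Biden share: {unweighted:.2f}")
print(f"Weighted Biden share:   {weighted:.2f}")
```

In this made-up case, a sample with too many college graduates overstates Biden's share by two points until the weights pull it back toward the electorate's mix.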
An alternative hypothesis is that it might just be too early in the election to have a large enough sample of polls for this analysis to be meaningful. Polls are only estimates, after all: (somewhat) randomly distributed measurements around the “true” latent trend in (measured) vote intentions. And many variables in a poll could explain differences that are showing up in an average of fewer than 10 data points. Pollsters have to identify a survey frame, choose a method for drawing a sample from that frame, design weights and likely-voter screens, and draw up questionnaires. Each step is its own source of uncertainty, and that’s before we get to the statistical sources of bias that are inherent in surveying.
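A quick calculation shows how noisy a gap between two small poll averages can be. The sketch below considers sampling error only, and it assumes a roughly tied two-way race, in which case the standard deviation of a single poll's margin is about 100 divided by the square root of the sample size, in percentage points. The poll counts and sample size are hypothetical, as are the function names. The five-point gap is the Florida difference described earlier (Biden +4 online versus Trump +1 by phone).

```python
import math

def margin_sd(n_respondents):
    """Sampling standard deviation (in points) of the margin in one poll,
    assuming a roughly tied two-way race with few undecided voters."""
    return 100.0 / math.sqrt(n_respondents)

def gap_se(n_respondents, polls_online, polls_live):
    """Standard error of the difference between two simple poll averages,
    treating polls as independent and counting sampling error only."""
    per_poll = margin_sd(n_respondents)
    return per_poll * math.sqrt(1.0 / polls_online + 1.0 / polls_live)

# Hypothetical setup: four polls per mode, 625 respondents each.
se = gap_se(n_respondents=625, polls_online=4, polls_live=4)
observed_gap = 5.0  # Biden +4 online vs. Trump +1 live-caller

print(f"Standard error of the gap: {se:.2f} points")
print(f"Observed gap in standard errors: {observed_gap / se:.2f}")
```

Under these assumptions the gap is less than two standard errors wide, and real polls carry additional error from weighting and likely-voter screens, so the true uncertainty is larger still.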
In sum, this is very much still an open question. I’ll be tracking this issue as the election gets underway, but in the meantime, be sure to note the mode in which a poll is conducted whenever one comes your way. If it’s an online poll, maybe treat it with a bit more skepticism than you otherwise would.
Hello, Elliott,
Do you find that people lie to telephone pollsters? I have found that the querent’s voice, tone, word choice, accent, can color the respondent’s responses. Got data?
I am still looking for a list of which pollsters weight by education. 538 lets you see an aggregate of all polls, or of polls of registered or likely voters. I want that, but for polls that are weighted by education.