Are Presidential Polls Bullshit?
Almost four years ago, the election of Donald Trump took most by surprise. Even the more cautious prognosticators gave Trump just a 28% chance of winning. This, along with the earlier Brexit vote, caused a lot of teeth-gnashing among pundits. Why didn’t we see these results coming? Why were the polls so off?
Now in 2020 we’re looking at a new election with brand new polls. Any sane person should wonder: can we trust the pollsters this time around? After 2016, the most famous poll analyst of them all, Nate Silver, concluded that the polls were as accurate as they had ever been. The problem, he said, was in people’s expectations of their accuracy. Personally, I think this is passing the buck, but I do agree that polls have always been far from perfect. Moreover, there are reasons to think they are even less accurate in 2020 than they were in 2016. In this article, I’d like to present some reasons why election polling isn’t a great way to predict the future president.
How Polling Works
First let’s take a short detour to talk about how polling works. We can imagine all Americans fit on some spectrum of political preference. Most people imagine this preference looks like a normal distribution, or the classic bell curve. To them it might look something like this.
In reality, the distribution of Trump support is probably closer to something like this.
The idea of polling is to randomly sample this distribution. If the sampling is truly random, then you’ll get nearly the same curve (and nearly the same averages) with just a small number of samples. Below I took a random sample of 1000 points from the above set of 30 million numbers.
Although the sample result isn’t quite the same, it’s pretty close. If each curve represents Trump voters, then the first curve would give an even 50% vote. The sampled vote (second graph) would give a 49.4% result. The act of random sampling introduces a small error, but the result is still pretty good.
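For readers who want to try this themselves, here is a minimal sketch of the same experiment. It is not the code behind the graphs above: the two-humped shape, the 0.5 cutoff for counting someone as a Trump supporter, and the smaller population of 300,000 (chosen so it runs quickly) are all assumptions made for illustration.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same numbers

def draw_preference():
    # Hypothetical two-humped electorate: each person sits near 0.25 or 0.75
    # on a 0-to-1 preference scale, with some spread around that center.
    center = random.choice([0.25, 0.75])
    return min(1.0, max(0.0, random.gauss(center, 0.1)))

def support_share(scores):
    # Assumed rule: anyone above the 0.5 midpoint counts as a supporter.
    return sum(score > 0.5 for score in scores) / len(scores)

# Stand-in for the full electorate (smaller than 30 million to keep it fast).
population = [draw_preference() for _ in range(300_000)]

# A truly random sample of 1000 people, drawn without replacement.
sample = random.sample(population, 1000)

pop_share = support_share(population)
sample_share = support_share(sample)

print(f"Support in the full population: {pop_share:.1%}")
print(f"Support in the sample of 1000:  {sample_share:.1%}")
```

Changing the seed moves the sample figure around by a point or two, which is the pure sampling error described above.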
If this were the only error involved, then election polls would almost always predict the final result. However, as we shall see, there are other errors involved, and they all have to do with how truly “random” this sampling process is.
Polls Have a Higher Margin of Error Than They Report
The best way to know how accurate polls really are is to see how well they’ve done in the past. Luckily, researchers at Stanford University did just that. After collecting historical election data and comparing it to pre-election polls, the authors of this paper found that the real margin of error in polls is about twice what is reported. As this New York Times article on the paper points out, polls are only good to within a range of 12 to 14 points (roughly +/- 6 to 7 points). This means that a close election cannot be reliably predicted by a poll. Some argue that averaging multiple polls together can lower this error, but even this may not help. Averaging only cancels out errors that differ from poll to poll. If all polls are wrong in the same ways, then averaging them won’t matter.
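To make the averaging point concrete, here is a rough sketch. The first function is the standard textbook formula for the 95% sampling margin of a proportion, which is the kind of figure polls usually report. Everything else is assumed for illustration: the true support of 50%, the 3-point shared bias, and the 20 polls of 1000 people each are made-up numbers, not figures from the Stanford paper.

```python
import math
import random
import statistics

def sampling_margin(p, n, z=1.96):
    # Textbook 95% margin of error for a proportion p measured on n people.
    return z * math.sqrt(p * (1 - p) / n)

random.seed(1)          # fixed seed so repeated runs give the same numbers
TRUE_SUPPORT = 0.50     # hypothetical true support in the electorate
SHARED_BIAS = 0.03      # hypothetical error that every poll has in common

def run_poll(n=1000, bias=0.0):
    # Each respondent says "yes" with probability true support plus bias.
    p = TRUE_SUPPORT + bias
    return sum(random.random() < p for _ in range(n)) / n

# Average 20 polls with no shared bias, then 20 polls that share one.
unbiased_avg = statistics.mean([run_poll() for _ in range(20)])
biased_avg = statistics.mean([run_poll(bias=SHARED_BIAS) for _ in range(20)])

print(f"Reported margin for n=1000:  +/- {sampling_margin(0.5, 1000):.1%}")
print(f"Average of 20 unbiased polls: {unbiased_avg:.1%}")
print(f"Average of 20 biased polls:   {biased_avg:.1%}")
```

The sampling margin for 1000 people comes out to about 3.1 points. Averaging shrinks the random scatter in both cases, but the average of the biased polls stays about 3 points too high no matter how many polls go into it.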
Bad Sampling Methods Introduce Unknown Errors
Now that we know polls aren’t all they’re cracked up to be, let’s take a look at some possible sources of their error. One issue involves the methods used to contact potential voters. In general there are three ways polls are conducted.
The classic way to conduct a poll is via landline phone. In the days before cell phones, this worked pretty well. However, as everyone knows, young people don’t have landlines anymore. In fact, only 6.5% of Americans still have a landline phone in their house. Despite this, some polls still rely on this antiquated technology.
In theory it should be easy to reach people now that everyone has a mobile in their pocket. Unfortunately with the advent of caller ID, very few people answer calls from random numbers. This has led to what some are calling a crisis in poll response rates.
Think about it. With con artists and telemarketers spamming us, what kind of person picks up a random call? According to Harvard Business Review, it used to take just 2000–2500 calls to get a nice sample of 800 Americans. Now it takes closer to 9000 phone calls. Whenever you see a phone poll, you should know that it is composed of the kind of people who are willing to talk to a potentially dangerous stranger.
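Those call counts translate directly into response rates. The snippet below is simple arithmetic on the figures quoted above; the function name is just an illustrative choice.

```python
def response_rate(completed, calls):
    # Fraction of calls that end in a completed interview.
    return completed / calls

then_low = response_rate(800, 2500)    # 0.32
then_high = response_rate(800, 2000)   # 0.40
now = response_rate(800, 9000)         # about 0.089

print(f"Then: {then_low:.0%} to {then_high:.0%} of calls produced a respondent")
print(f"Now:  {now:.1%} of calls produce a respondent")
```

In other words, roughly one call in three used to succeed, and now it is fewer than one in ten.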
The newest form of polling involves using surveys on the internet. This method allows pollsters to reach many people very easily, but it too has issues. Some worry about “mode” effects, where people answer a poll differently online than they do in person. Personally, I think the anonymity of internet surveys better matches the anonymous nature of voting. However, I do worry about the type of people who answer online questionnaires. Generally, these people must be more trusting and less busy. In fact, Pew Research has found that “online opt-in polls … tend to overrepresent adults who self-identify as Democrats, live alone, do not have children and have lower incomes.”
If we look at the most recent election polls, we see that they rely mostly on internet surveys. The exceptions are Rasmussen, Fox, and Monmouth, which rely heavily on phone calls.
The benefit of each of these methods is that pollsters have convenient access to many people. What these methods can’t do is reach every type of voter. Polling companies make great efforts to sample people of all races, income levels, education levels, and regions. What they can’t control for is how hard-working these people are, how busy, or how patient. These things are harder to quantify. However, by not accounting for them, polls remove an element of random selection from a process that depends on it.
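Here is a small worked example of how this can skew a result. Every number in it is hypothetical: an electorate split evenly between “busy” and “available” people, with different response rates and different levels of support in each group. The point is only to show the mechanism, not to estimate its real size.

```python
# Hypothetical groups: share of the electorate, chance of answering a poll,
# and support for the candidate within the group. All values are made up.
groups = {
    "busy":      {"share": 0.5, "response": 0.05, "support": 0.55},
    "available": {"share": 0.5, "response": 0.15, "support": 0.45},
}

# True support: weight each group by its share of the electorate.
true_support = sum(g["share"] * g["support"] for g in groups.values())

# Polled support: weight each group by its share of the people who answer.
responders = sum(g["share"] * g["response"] for g in groups.values())
polled_support = sum(
    g["share"] * g["response"] * g["support"] for g in groups.values()
) / responders

print(f"True support:   {true_support:.1%}")
print(f"Polled support: {polled_support:.1%}")
```

With these made-up numbers the true support is 50%, but the poll reports 47.5%, because the group that answers more often also leans the other way. Calling more people does not fix this, since the gap comes from who answers rather than from how many answer.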
Polls Tend to Oversample Democrats
We’ve already established that a close election will have problems with polling. On one hand, the margins of error might be larger than the difference in support for candidates. On the other hand, the sampling methods used tend to select for a certain type of voter. If online surveys really do sample more Democrats, we should see this in raw polling data.
Below are summaries of the most recent National Polls from September 2020.
In every poll, Democrats were represented more heavily than Republicans. In most of them, they were more represented than registered independents as well. This may not be the fault of pollsters. After all, being a registered Republican is not nearly as popular as it once was. In fact, there are now more registered independents than there are Republicans. The oversampling of Democrats follows national averages in political affiliation. Still, this lack of representation will no doubt skew poll results, especially among unregistered voters. I don’t know why more people don’t talk about this clear bias.
Polls Don’t Sample Casual Voters Well
When doing a post-mortem on the 2016 election, experts found something interesting. It turns out that voters who were labelled “undecided” ended up going for Trump in the Midwest. Voter turnout in rural areas was also higher than expected. Many brushed off this result, but I think it is extremely important. It shows that polls didn’t do a good job of capturing “undecided” voters, or those who weren’t expected to vote.
Again, I think this is mostly due to polling methodology. Politically active people are eager to answer polls; the average American is less so. Those labelled “independent” are underrepresented in the national polls, while those who aren’t registered aren’t even counted. This excludes all the voters who have children, full-time jobs and better things to do than talk to strangers or answer surveys. Ultimately elections are decided by two things: who voters prefer and how likely they are to actually vote. In my opinion, polls do a terrible job at capturing this second element.
As we approach November 2020, it is expected that the majority of the country will be under some kind of lockdown. After a year of widespread viral deaths and urban riots, it is very difficult to guess at the national mood. No doubt most people will be consumed both by fear of the pandemic and by fear that the opposing party could rise to power. With so many uncertainties, I personally doubt the predictive power of polls will be better this year than it was in 2016.
1 - For a detailed analysis of all the ways polling can go wrong, see this Cambridge paper