Why were pre-election polls and forecast models inconsistent with the election of Donald Trump?
As is now obvious, pre-election polling and forecast models were inconsistent with the vote count in this election cycle, as not many predicted a Trump victory.
Is there any particular reason why polling was so off?
Related NYT article link: Why Trump Won: Working-Class Whites "The result was that many postelection analysts underestimated the number of white working-class voters over age 45 by around 10 million." (Although, as mentioned in some of the answers, the polls weren't actually off by that much, with the results of the various state elections falling within the margin of error of the polls that had predicted Clinton victories.)
It seems to me that the problem was not the polls but the media's interpretation of what they meant. The polls showed a tight election, with a small advantage for Clinton in the popular vote. That is exactly what happened: Clinton won the popular vote by a small margin. That the press misinterpreted "tight race, with a small advantage for Clinton" as "Clinton is already elected" is a different problem.
Compare the French elections. Polls were completely wrong, by a large margin. Polls predicted a Macron victory by a wide margin. He won by a **much** wider margin. But since he still won, the polls got away with their blunder. The public is under the impression that if the polls say "X has a 0.5% advantage", then X must win. A 1 percentage point mistake reverses the result, and presto: the polls were wrong, useless, they lie or are corrupt. If, on the other hand, X wins not by 0.5 pp but by 15, then hey, the polls were right, they predicted X's victory!
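To put rough numbers on that point, here is a minimal Python sketch. It assumes the true margin is normally distributed around the polled lead with a standard deviation of 3 percentage points; both the normal model and the 3 pp figure are illustrative assumptions, not values taken from any particular poll.

```python
from statistics import NormalDist

def win_probability(lead_pp: float, error_sd_pp: float) -> float:
    """Chance that the poll leader actually wins, assuming the true margin
    is normally distributed around the polled lead (an assumption)."""
    # The leader wins whenever the true margin ends up above zero.
    return 1.0 - NormalDist(mu=lead_pp, sigma=error_sd_pp).cdf(0.0)

if __name__ == "__main__":
    # Hypothetical leads, in percentage points, with an assumed 3 pp error.
    for lead in (0.5, 3.0, 15.0):
        p = win_probability(lead, error_sd_pp=3.0)
        print(f"polled lead {lead:4.1f} pp -> leader wins with p = {p:.3f}")
```

Under these assumptions a 0.5 pp lead is little better than a coin flip, while a 15 pp lead is close to certain even if the size of the margin is badly off.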
From the earliest of polls, Trump was always within the margin of error vs Clinton, and she did wind up getting a few million more popular votes. How, exactly, are you claiming that the polling was off?
I second what @LuísHenrique and PoloHoleSet said. Here's an article from Five Thirty-Eight from just before the election laying out just how close the polls were and how overconfident pundits were about a Clinton victory: https://fivethirtyeight.com/features/trump-is-just-a-normal-polling-error-behind-clinton/
Reasonable theories I have heard have included:
A change in polling foundations (home phones become cell phones = limitations on traditional cold calling... and also the shift into online polls). Plus perhaps the diminishing patience people have with enduring the polling process (I believe the percentage of people who agree to it has dropped consistently)
A hesitation/fear that keeps some people from admitting that they support a candidate who is being widely presented as detestable/deplorable in much of the public discussion/media. [apparently an effect common enough to have a term, see the Shy Tory Factor... the possibility it might be important was even presciently suggested in a question here a couple weeks ago]
A possible tendency for people to publicly show full support for minority/heroic candidates because it's the socially favorable thing to do, even when they're actually entertaining uncertainty internally. [this is along the lines of the Bradley Effect]
People failing to turn out as they've indicated they will... perhaps due to weather (though it's fairly unlikely this was a big factor in this election), a false sense of security (such as when the polls in the days leading up to and into election day suggest a comfortable victory!), or just a general failure to muster the effort/will to follow through with the voting.
Regardless, there is a widespread trend of polls failing recently, and maybe even specifically underestimating the conservative side. In 2015, the UK Parliament was projected to come out about even between the top two parties, but the Conservatives won by over 5%, ending up with the majority (which was considered a near-0% possibility as the day began). This summer, Brexit passed (considered almost certain to fail as the day began, it ended up passing by 4%). And last month the Colombian Peace Referendum failed (after being consistently polled to pass by about 10%). So perhaps this is a trend to be aware of going forward until polling methods can hopefully adapt. Others here have also pointed out there was an underforecast conservative swing in the 2016 Iceland election and changes in the Swedish election.
Note, though, that most models showing the full spread of possibilities didn't say this was a set Clinton win, but that it leaned maybe 70-80% likely that Clinton would win.
A 1 out of 5 chance of being wrong is not insignificant...
If they say there's a 20% chance of rain today, you shouldn't be surprised if it does rain! In meteorology we often face the same continuing struggle, particularly when it comes to issues like hurricane/storm forecasting: getting people to understand the uncertainty and the full range of realistic possibilities. This is something we should press our media to portray better, and something school math courses could focus on more, perhaps to the significant benefit of a great many.
Considering that most polls have maybe a 4-6% typical error, this result was really quite within the range of possibility for most (although their consistent bias suggests that there really are some fundamental shortcomings). But, still, most of the quality election forecasters did also measure in some considerations of this trend towards less poll reliability, and were urging caution regarding overconfidence in the indicated spread (as Nate Silver of fivethirtyeight did here on election morning).
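As a rough check on that error range, here is a small Python sketch of the textbook sampling margin of error for a simple random sample. The sample size of 1,000 and the 46%/43% split are hypothetical, and real polls carry additional non-sampling error that this does not capture.

```python
from math import sqrt

def moe_share(p: float, n: int, z: float = 1.96) -> float:
    """95% sampling margin of error on one candidate's share."""
    return z * sqrt(p * (1 - p) / n)

def moe_lead(p_a: float, p_b: float, n: int, z: float = 1.96) -> float:
    """95% sampling margin of error on the lead (p_a - p_b), when both
    shares come from the same multinomial sample."""
    return z * sqrt((p_a + p_b - (p_a - p_b) ** 2) / n)

if __name__ == "__main__":
    n = 1000                 # hypothetical sample size
    p_a, p_b = 0.46, 0.43    # hypothetical poll shares
    print(f"margin of error on a share: +/- {100 * moe_share(p_a, n):.1f} pp")
    print(f"margin of error on the lead: +/- {100 * moe_lead(p_a, p_b, n):.1f} pp")
```

The point of the second function is that the uncertainty on the lead is nearly twice the uncertainty on a single share, so a 3-point lead in a poll of this size sits well inside the noise.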
A 20% chance of scraping a win does not translate into a significant chance of sweeping through and flipping the states he did. If you'd been betting on that before the election, the odds you'd have been offered would have been astronomical. It's clear this is not just a one-in-five event: the polling was just outright broken, and if the left wing is to stand a chance of being able to plan their strategies properly in future, they need to figure out a way of getting more realistic forecasts.
Ok, let's see where this result was pegged. https://i.stack.imgur.com/r60uG.png is a zoom of the fivethirtyeight graph from the start yesterday morning. It seems at least half the red highlighted region volume is over 300... and maybe a third over the fairly likely 305ish value. That would work out to be around a 9% share of their possibilities. Lower... but not astronomical.
Also consider: historically, Democrats do better in polls. This is not intentional or a bias, but the natural result of supporters of certain political parties being more likely to respond to polls.
You ignore the most probable reason the mainstream polls were wrong. The press has been manipulating the data for decades and have gotten away with it because the people had no means to fact check the press. They realize most people are lemmings and just follow the crowd, so spouting off how "their" candidate is leading in the polls has tended to tilt support in favor of "their" candidate. Thanks to the Internet, people now have access to all the information they could possibly absorb. So perhaps, the press manipulating elections via their poll reporting simply no longer works.
You also miss the fact that properly done polls would have shown a much more likely possibility that Trump could win. After all, Trump's internal polls showed that he needed to campaign in Michigan and Wisconsin. The press mocked him for it since those states were already in the bag for Hillary. Somehow, I don't think Trump's pollsters have any more knowledge on how to do proper polling, so that eliminates incompetence on the news media's part and leaves intentional misleading or willful ignorance as the only other plausible explanations.
Another issue is that polls only include those they consider to be likely voters, and if low-propensity voters turn out, the polls will not reflect that effect.
Another important, and often overlooked, thing to consider is the construction of the sample population in the poll. Pollsters use certain demographic information, assumptions, and trends to build a model electorate. My guess is that the sample populations used in many polls were not representative of the actual electorate.
Most analysts did not consider that polling errors in different states are not independent but may run in the same direction (for states with similar demographics): a candidate over-performing the polls in Ohio likely also over-performs them in Michigan, Wisconsin and Pennsylvania. 538 did warn about this.
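The effect of correlated errors can be illustrated with a small Monte Carlo sketch in Python. Everything numeric here is a made-up assumption for illustration: four states each polled at a 3-point lead for the favorite, and a total polling error with a 3 pp standard deviation that is either fully independent per state or mostly shared across states.

```python
import random

def p_underdog_sweeps(leads, shared_sd, state_sd, trials=100_000, seed=0):
    """Estimate the chance that the trailing candidate wins every state.

    Each state's true margin = polled lead + shared error + state error,
    with both errors drawn from normal distributions (an assumption).
    """
    rng = random.Random(seed)
    sweeps = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, shared_sd)  # same error hits every state
        if all(lead + shared + rng.gauss(0.0, state_sd) < 0 for lead in leads):
            sweeps += 1
    return sweeps / trials

leads = [3.0, 3.0, 3.0, 3.0]              # hypothetical polled leads, in pp
total_sd = 3.0                            # assumed total error per state
shared_sd = 2.5                           # assumed shared component
state_sd = (total_sd**2 - shared_sd**2) ** 0.5  # keeps total error equal

independent = p_underdog_sweeps(leads, shared_sd=0.0, state_sd=total_sd)
correlated = p_underdog_sweeps(leads, shared_sd=shared_sd, state_sd=state_sd)

if __name__ == "__main__":
    print(f"independent errors: P(sweep) ~ {independent:.4f}")
    print(f"correlated errors:  P(sweep) ~ {correlated:.4f}")
```

With the same per-state uncertainty in both runs, the sweep goes from a freak event under independence to something that happens a few percent of the time once most of the error is shared, which is the substance of the warning.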
@Dunk, look, I can verify media bias, it's something I directly see almost daily in how weather events are presented. And I absolutely agree that the media showed a sweeping imbalance in this election. Plus scientists are very prone to fail at identifying the impact of their own bias on their work. But, for what it's worth, any suggestion that the media actually systematically manipulates polling data itself seems foundationless to me, based upon both the character of the data scientists I know and the close similarities to my own scientific background.
Shortsighted errors in selecting people and questions, I can totally see. And a manipulation of the narrative by applying selective emphasis in the media, absolutely (though it can go both ways; they generally played up the tightening race early this week while most polls were no longer showing that). But when people expand it to suggest a full-blown conspiracy, it only ends up muting out the importance of very real problems.
I did just see another consideration put forth on Inside Edition of all places: Trump and many conservatives have been vocally proclaiming that the media is corrupt throughout the election. Well, it reasonably follows that a portion of those who believe that would therefore actively choose not to participate in polls because of that belief. And it too is a pretty long-held belief about the media at this point. So if you're upset at the bad polling, and want a more reliable polling estimate, it seems you'll have to move towards convincing those people that their concerns are being heard there?
A key contributor that goes along with the shift from landlines to cell phones is that many older voters view voting differently than younger generations do. For them it is a duty and a responsibility to get out and vote. The number of retirees who do not participate in social media may be dwindling, but the proportion of that group who got out and voted would probably shock and shame millennials and Gen Xers. Polls are slanted toward gathering data through social media. So they missed a segment, one that actually turns out in high percentages.
Don't forget the weighting. Most polls weighted women's votes rather heavily towards Clinton, just because she's a woman. That was rather optimistic, just like the weighting that polls in Obama's elections put on black voters.
@DewiMorgan Nate Silver wrote a nice article which basically addresses the point. The basic idea is that if 1 in 100 people across the nation switched from Trump to Hillary, the map would look basically like it was expected to (Hillary with 307 electoral votes). Then all the articles are about how the Republicans are completely crushed and swept aside. And that's with just 1 in 100 people changing their minds. The margins on this stuff are very small, and the conclusions we draw are pretty bombastic.
Andrew Gelman gives a rebuttal to the "Shy Tory" theory: "It’s possible, but I’m skeptical of this mattering too much, given that Trump outperformed the polls the most in states such as North Dakota and West Virginia where I assume respondents would’ve had little embarrassment in declaring their support for him, while he did no better than the polls’ predictions in solidly Democratic states. Also, Republican candidates outperformed expectations in the Senate races, which casts doubt on the model in which respondents would not admitting they supported Trump."
There is a problem: you can't blame the noise. The error is systematic. We can see that from the correlation across the progressive vs. conservative issues. Combining all the polls gives a hint of the real problem. The Shy Tory Factor and the Bradley effect seem to be the reason for such mistakes. And, maybe, some "evil" agent in all these polls could be driving the results in a biased way.
"A hesitation/fear that keeps some people from admitting that they support a candidate who is being widely presented as detestable/deplorable in much of the public discussion/media. " I think it's a key. Polls are just a part of the same media song, they present only ones that follow their point of view.
Hillary Clinton still won the popular vote, though by a slim margin, which is still reflective of the polls. She only lost due to the way individual votes aggregated and translated into electoral college votes. Could there have been pollster miscalculation in the way polling results would actually translate to electoral college votes?
Your points 1 and 2 are at best partially correct. Australia has maintained a high degree of polling accuracy, despite undergoing the same landline-to-mobile change. Australian pollsters also correctly forecast the vote share going to our version of Trump/Brexit (the One Nation party). I would argue the root cause of polling inaccuracy is "an inability to determine who will turn up and vote". This might be exacerbated by the existence of a populist position/person, but that is not the root cause.
Polls have overestimated Democrats for some time. The most plausible explanation I've heard for this is the tendency for Democratic voters to be more inclined than Republicans to agree to answer polls, causing them to be overrepresented once those who were asked but refused to answer are excluded from the totals polled.
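The arithmetic behind that kind of differential non-response is easy to sketch in Python. The response rates below (10% vs 9%) and the 50/50 true split are purely hypothetical, chosen only to show the size of the effect when no weighting corrects for it.

```python
def observed_share(true_share_a: float,
                   response_rate_a: float,
                   response_rate_b: float) -> float:
    """Share of party A among people who actually answer the poll,
    given each side's willingness to respond (two-party simplification)."""
    responders_a = true_share_a * response_rate_a
    responders_b = (1.0 - true_share_a) * response_rate_b
    return responders_a / (responders_a + responders_b)

if __name__ == "__main__":
    # Hypothetical: electorate split 50/50, but A's voters answer slightly more often.
    share = observed_share(0.50, response_rate_a=0.10, response_rate_b=0.09)
    lead = 100 * (2 * share - 1)
    print(f"poll shows A at {100 * share:.1f}%, an apparent lead of {lead:.1f} pp")
```

Under these assumptions, a one-point gap in willingness to answer turns a tied race into an apparent lead of several points, comparable to the polling misses discussed in this thread.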