A better understanding of how today’s polling works and what it can and cannot tell us will serve us well in the coming year’s trip through the pre-electoral Fun House.
In and around an election year, polls are constantly in the news. That includes the inevitable crowing, dismissing, explaining, and explaining away. The media and the public squeeze the numbers for their precious meaning juice. The word “momentum” pops up a lot. It’s the proverbial horse race, the latest polls our tip sheet, and we have little else to go by in sussing out which horse can do.
Poll bashing has been with us about as long as polling. But the current round stands out as especially hostile, bordering in some cases on the hysterical, leading no less astute an observer than Thom Hartmann to ask, quite simply, “Is polling dead?”
What has occasioned this particular bash-fest is a seeming collision of two trains — a poll train and a vote train. The first train came in the form of a late-October Times/Siena poll, released on November 5, that found Donald Trump beating Joe Biden by meaningful margins in five of six swing states surveyed. The poll also showed a growing and potentially fatal weakness for Biden among key Democratic constituencies: voters of color and the young.
These results understandably sent shock waves through the Democratic Party, as well as through the tens of millions of Americans who have come to regard Trump as a deadly threat to democracy.
Everyone I spoke to was tied up in semiotic knots trying to wrap their heads around the surrealistic reality that Trump — Trump! — was suddenly odds-on to reclaim the presidency next year. And that voters overwhelmingly trusted Trump more than Biden on the economy, immigration, and national security. How could the electorate — at least that part of it residing in the election-deciding swing states — be so, well, some of the phrases I heard were “deluded” … “out of its mind” … “hypnotized” … and “stupid?”
No, impossible! That poll had to be wrong! Garbage! Crap! Who did they ask, a roomful of MAGAs? They just called landlines! A bunch of hayseeds named Earl and Mabel answered their rotary phones and said they’re for Trump! Anyone under 60 with half a brain has a cell phone and the poll missed all of them! I’ve never been polled; I’ll bet it’s all a big hoax! And so on and so forth.
Then, two days after the poll was made public, the other train came through — last Tuesday’s general election, a big night for Democrats in Ohio, Virginia, Kentucky, New Jersey, and elsewhere around the country — and the outraged poll-bashers had their apparent vindication. See, we told you, just look at the votes! Forget the polls — they’re meaningless! Garbage! Crap! Etc!
The more I listened, the more it came through how ill-informed so much of this discussion was — how little most Americans, even many in the media, understand about how polling really works and what factors actually contribute to and detract from its reliability.
I want to offer as clear an explanation as I can of a complex process, in the hope that a better understanding of how today’s polling works, and what it can and cannot tell us, will serve us well in the coming year’s trip through the pre-electoral Fun House.
Accordingly, this will be the first in a series of columns examining different aspects of the polling process. Today I will discuss the pre-election, or tracking, polling that we’ll be seeing so much of over the next year. Subsequent pieces will extend that discussion and look specifically at exit polling, at a phenomenon I refer to as “an insidious poll/vote count feedback loop,” and at what a world with no polling would look like.
My Life as a Pollster
Let me begin by stating that, fresh out of college during the Carter presidency, I served as an analyst for a major Washington political survey research shop. My duties involved the drawing of samples, construction of questionnaires, and political interpretation of the numbers that emerged from them — so I had a great view of the sausage being made, every step in the process.
We sent interviewers out into the field to knock on doors, and used the White Pages and a very simple iterative spacing algorithm for the other surveys, which we conducted by phone. Results came in the form of 20-lb binders of 16-column printouts. We needed big desks. Computers had a backstage role but virtually all of our analysis was manual and labor-intensive.
Polling has evolved significantly during the intervening years, mainly in response to advancing technologies — which have both posed new challenges and provided new tools. In my 20-plus years of election forensics work, I have kept up with these developments through continuing contact with my colleagues in the field.
The Strange Powers of Sampling
Statistics and probability, the branches of mathematics on which polling is based, can often be counterintuitive. Probably the most counterintuitive aspect of polling is that a very modest sample can very accurately represent a very large whole.
Specifically, when dealing with an election, the candidate preferences of, say, a million voters can be represented, within a margin of error of just 3 percentage points at the conventional 95 percent confidence level, by a sample of about a thousand (1,066) of them, or only 0.1 percent of the whole. Even more counterintuitive: If you were trying to assess the preferences of 100 million voters, adding just two more respondents to your sample of 1,066 would produce the same 3-point level of accuracy!
Call BS if you must, but I swear that’s the way it works. The math is trippy but not particularly complicated. For anyone who doesn’t want to delve into the equations, there are fun and friendly tools for instantly calculating the sample size needed for any population at any desired degree of accuracy; I hope you’ll take a few minutes to plug in some numbers and play!
If you do, you’ll discover there’s an equally strange flip side to sampling: If you wanted to take those same million voters and predict their votes to within 1 percent rather than 3 percent, with the same degree of confidence, your sample would have to be roughly nine times as large (9,513).
So bigger populations barely move the sample-size needle while greater precision moves it a lot, but the bottom line is that polling is an extraordinarily powerful, efficient, and indispensable tool for “reading the room” when the “room,” like a state or a country, has millions of people living in it.
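For readers who would rather see the arithmetic than take it on faith, here is a minimal sketch of the standard sample-size calculation for a simple random sample with a finite population correction. The function name and its defaults are my own assumptions: a z-score of 1.96 (roughly 95 percent confidence) and a worst-case 50/50 split. Under those assumptions it reproduces the figures quoted above.

```python
import math

def sample_size(population, margin, z=1.96, p=0.5):
    """Respondents needed for a simple random sample.

    population: number of voters in the whole electorate
    margin:     desired margin of error as a fraction (0.03 = 3 points)
    z:          z-score for the confidence level (1.96 ~ 95 percent; assumed)
    p:          assumed true proportion; 0.5 is the worst case
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # The finite population correction trims it slightly for smaller electorates.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

print(sample_size(1_000_000, 0.03))    # 1066
print(sample_size(100_000_000, 0.03))  # 1068
print(sample_size(1_000_000, 0.01))    # 9513
```

Note where the leverage sits: the population appears only in the small correction term, while the margin is squared in the denominator. That is why a hundredfold jump in population adds two respondents and a threefold gain in precision multiplies the sample roughly ninefold.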
Marbles vs. Voters
Now if all this sounds just a little too good to be true, that’s because it is (but just a little). Why? Because the numbers above — and all the numbers that pop up when you use the handy sample-size tool — are predicated on a strictly random selection process. They yield what I like to refer to as an “ideal” margin of error, arrived at by a simple mathematical formula and calculation.
It would work perfectly well if you were dealing with, say, a giant box containing a million marbles, some number of them blue, the rest red, and the whole box thoroughly shaken and mixed. If you put on a blindfold and picked out 1,066 marbles and counted the reds and blues, that would tell you quite reliably (about 19 times out of 20), within 3 percentage points, the proportion of red and blue marbles in the whole box.
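The marble box is easy to simulate. The sketch below repeats the blindfolded draw many times and reports how often the sample lands within 3 points of the truth. Everything specific here is an assumption made for illustration: a box that is 47 percent red, 1,000 repeated draws, and a fixed random seed so that the run is repeatable.

```python
import random

def draw_red_share(box_size, red_count, sample_size, rng):
    """Blindfolded draw: pick marbles without replacement, return the red share."""
    # Marbles are numbered 0..box_size-1; the first red_count of them are red.
    picks = rng.sample(range(box_size), sample_size)
    reds = sum(1 for marble in picks if marble < red_count)
    return reds / sample_size

def coverage(box_size=1_000_000, red_count=470_000, sample_size=1_066,
             margin=0.03, trials=1_000, seed=2023):
    """Fraction of repeated draws that land within `margin` of the true share."""
    rng = random.Random(seed)
    true_share = red_count / box_size
    hits = 0
    for _ in range(trials):
        estimate = draw_red_share(box_size, red_count, sample_size, rng)
        if abs(estimate - true_share) <= margin:
            hits += 1
    return hits / trials

print(f"{coverage():.1%} of draws fell within 3 points of the true share")
```

The share that lands inside the margin should hover around 95 percent, which is what the "ideal" margin of error actually promises: not certainty, but being close nearly every time.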
We know, however, that it is virtually impossible to draw a true random sample of a given electorate. Real-world factors like accessibility and nonresponse biases put the kibosh on strict randomness and inevitably degrade the accuracy of results. Even worse, they do so in a manner that is nonuniform and difficult to quantify. And, to put the final touch on the pollster’s nightmare, both accessibility and nonresponse problems have gotten dramatically worse with the evolving technological and political environments.
In very simple terms, the cell phone has largely displaced the landline; people screen and don’t answer most calls from unrecognized numbers; different age, racial, income, and partisanship groups demonstrate differing propensity to answer, cooperate, and complete the survey when the pollster calls; and some portion of voters who don’t respond will vote, while some portion of voters who do respond won’t vote. What a mess!
Stratification to the Rescue
The good news is that the polling industry has developed tools to deal with these problems, and the best-funded and most professional outfits have settled on something like an industry-standard approach to applying them.
Here’s as good a place as any to admit that there’s a fair amount of art that goes with the science in polling. But also to acknowledge that the pollsters have gotten pretty good at the art.
The key to solving the problems of accessibility and variability of response levels lies in the weighting of the samples. The term of art is “stratification” (strictly, “post-stratification,” since the adjustment is applied after the surveys are completed). At the risk of oversimplifying, the process can be boiled down to the pollster’s best guess about what will be the ultimate demographic and political composition of the electorate.
Of course the word “guess” is not terribly reassuring — in fact, I wouldn’t blame you if you think it sounds downright fishy. So I will upgrade the characterization from guess to estimate, which sounds more encouraging.
True, there is some guesswork in predicting the demographic/political contours of the “actual” electorate. But that guesswork is data-rich and highly informed, drawing on census and voter registration data, vote counts from prior elections, comparisons of those vote counts with associated polling results, and sophisticated trend and pattern analyses of all this information. The industry, whose business model is predicated on “getting elections right,” is fanatical about error correction — which operates something like an AI learning process.
The “Trust” Factor
Perhaps the best way to illustrate how this all works in practice is to offer an example. One major suspected source of nonresponse bias is that the level of trust in “institutions,” which can include polling outfits, is significantly correlated with political persuasion.
Completing a survey full of politically charged questions is, in a sense, an act of trust. If MAGA/Republican voters are more apt to regard someone (assumed to be from “the radical-leftist media”) asking such questions as intrusive, suspect, perhaps even a threat to their “freedom,” it follows that they will be more apt to refuse to respond to a survey when chosen.
If unaddressed — i.e., if an initially random sample wound up including just the surveys actually completed, without weighting — the disproportionately distrustful MAGA/Republican voters would wind up significantly underrepresented. And since that partisan affiliation correlates very strongly with candidate preference, Trump and other Republican candidates would wind up doing much better in the election than predicted by such polls — a phenomenon known as the red shift, a term I coined for it back in 2004.
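To see how differential nonresponse produces that gap, here is a small expected-value sketch. All of the numbers are hypothetical, picked only to make the mechanism visible: Republicans are 32 percent of the electorate but answer at half the rate of everyone else, and the two groups back the Republican candidate at 90 and 30 percent.

```python
def unweighted_estimate(groups):
    """Expected candidate share among completed surveys, with no weighting.

    groups maps a label to (share_of_electorate, response_rate, support).
    """
    # A group's share of completed surveys is proportional to
    # its size times its willingness to respond.
    completes = {g: share * rate for g, (share, rate, _) in groups.items()}
    total = sum(completes.values())
    return sum(completes[g] / total * groups[g][2] for g in groups)

def true_share(groups):
    """Candidate share in the full electorate."""
    return sum(share * support for share, _, support in groups.values())

# Hypothetical figures, for illustration only.
electorate = {
    "Republican": (0.32, 0.05, 0.90),     # less trusting, so fewer completes
    "Everyone else": (0.68, 0.10, 0.30),
}

poll = unweighted_estimate(electorate)
actual = true_share(electorate)
print(f"unweighted poll: {poll:.1%}, actual electorate: {actual:.1%}")
# unweighted poll: 41.4%, actual electorate: 49.2%
```

Under these made-up inputs the raw poll understates the Republican candidate by nearly 8 points, with no one lying and no one miscounting. The sample simply has the wrong shape.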
The solution lies in either oversampling such an underrepresented cohort — in this case, the respondents who identified as Republican — or, more conveniently, just upweighting the responses of those Republicans who did complete their surveys.
So if, say, in a given state, 160 of 1,000 (16 percent) completed surveys indicated Republican affiliation, and your data-rich models indicated that the Republican voters actually constituted 32 percent of the electorate, you’d upweight each survey (i.e., all the responses on it) from a Republican-identified respondent by a factor of 2x. Their “votes” would count double because each respondent was essentially answering for two categorically similar respondents, one of whom had balked at taking the survey.
Of course the actual numbers are less “round” but the principle holds. Moreover, it can be and is applied to a whole set of other demographic and political characteristics. For instance, it might be more difficult to access inner-city Black voters, young voters living in college dorms, or rural voters lacking internet service. All these groups would be upweighted accordingly. Groups that were easier to access or especially eager to respond would, conversely, be down-weighted. (If you want to take a look at a real-world example of the whole process, here’s a link to the crosstabs of the recent, controversial Times/Siena swing states poll — scroll almost to the bottom for the Methodology section.)
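The upweighting described above can be sketched in a few lines. The sample counts and electorate shares come from the example in the text; the support figures and the function names are my own hypothetical additions. Each group's weight is simply its share of the expected electorate divided by its share of the completed surveys.

```python
def poststratification_weights(sample_counts, target_shares):
    """Weight per group = share of expected electorate / share of the sample."""
    total = sum(sample_counts.values())
    return {g: target_shares[g] / (sample_counts[g] / total) for g in sample_counts}

def weighted_share(sample_counts, support, weights):
    """Candidate share once each completed survey counts by its group weight."""
    weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
    weighted_yes = sum(sample_counts[g] * weights[g] * support[g] for g in sample_counts)
    return weighted_yes / weighted_total

completed = {"Republican": 160, "Everyone else": 840}            # from the example
electorate_model = {"Republican": 0.32, "Everyone else": 0.68}   # from the example
support = {"Republican": 0.90, "Everyone else": 0.30}            # hypothetical

weights = poststratification_weights(completed, electorate_model)
raw = weighted_share(completed, support, {g: 1.0 for g in completed})
adjusted = weighted_share(completed, support, weights)

print({g: round(w, 2) for g, w in weights.items()})
# {'Republican': 2.0, 'Everyone else': 0.81}
print(f"raw: {raw:.1%}, weighted: {adjusted:.1%}")
# raw: 39.6%, weighted: 49.2%
```

Notice that the Republican surveys count double while everyone else's are trimmed to about 0.81 each, so the weighted sample still adds up to 1,000 respondents.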
Ginning Up a Representative Sample
This may all sound a bit like “the secret’s in the sauce,” but it does work. The real challenge is not in the math, which computers can handle without breaking a sweat, but in figuring out just what size slice of the eventual electorate pie each group will wind up being: Will the electorate be 11 percent Black or 12 percent; 35 percent independents or 33 percent; 15 percent 18-to-25s or 14 percent; etc.? It matters, especially for those groups that are highly correlated with candidate choice — party ID and race being classic examples.
But this is where the technological advances that enable big-data number crunching and pattern analysis compensate for the other technological advances (and political developments) that have posed new challenges for pollsters. It is possible to make deeply informed estimates of the ultimate composition of a given electorate, which then determine the weightings of the sample and, to a large extent, the accuracy of a poll’s prediction. It is possible, in effect, to synthesize a representative sample.
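How much does a one-point disagreement about the electorate's composition matter? A rough sketch, using invented support levels (a candidate winning 85 percent of Black voters and 45 percent of everyone else), shows the topline moving with nothing changed except the assumed slice of the pie.

```python
def topline(composition, support):
    """Weighted candidate share given assumed group shares of the electorate."""
    assert abs(sum(composition.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(composition[g] * support[g] for g in composition)

# Hypothetical within-group support, for illustration only.
support = {"Black": 0.85, "Other": 0.45}

for black_share in (0.11, 0.12):
    composition = {"Black": black_share, "Other": 1 - black_share}
    share = topline(composition, support)
    print(f"Black share {black_share:.0%}: candidate at {share:.1%}")
# Black share 11%: candidate at 49.4%
# Black share 12%: candidate at 49.8%
```

Four-tenths of a point sounds small, but it comes straight off the opponent's number too, and a handful of such judgment calls, each defensible, can add up to a different headline in a close race.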
[Listen to this recent WhoWhatWhy podcast for the views of a current leader in the field.]
Considering all this, I think it is fair to say that polling is hardly the “disaster” that the bashers have made it out to be. At the same time, one would be very foolish to take the results of any individual poll as gospel.
Picking Out the Good Ones
To begin with, there are polls and there are polls. Just about anyone can float some questions on social media and call it a “poll.” Of course there’s no randomness, no statistical power: The respondents are all self-selected and therefore comically unrepresentative of any conceivable electorate as a whole.
And then there are agenda-driven organizations and politically biased pollsters whose principal aim is to impact the electoral process by feeding it false information — i.e., inaccurate polling results, generally favoring their dog in the race. The web is swimming with polling garbage and caveat emptor is highly advised.
On the other hand, there are established polling outfits whose reputations have been built on sound (and disclosed) methodologies along with consistently accurate results over time.
Even then, an individual poll may miss. Polling shit happens. But when polls are aggregated, graded, and weighted (so the outliers and partisan push-polls are discounted) by outfits such as FiveThirtyEight or RealClearPolitics, the resulting weighted averages are generally highly reliable. They may be a snapshot or a trend, and things may change. But you ignore this trove of information at your peril.
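Here is a toy version of what aggregation buys you. It is not any aggregator's actual formula; the weighting rule (square root of sample size times a quality score) and the three polls are assumptions invented for the illustration.

```python
import math

def weighted_average(polls):
    """Average poll margins, weighting by sample size and a quality score."""
    # Weight = sqrt(sample size) * quality; both choices are illustrative.
    weights = [math.sqrt(p["n"]) * p["quality"] for p in polls]
    total = sum(weights)
    return sum(w * p["margin"] for w, p in zip(weights, polls)) / total

def simple_average(polls):
    """Plain mean of the margins, treating every poll as equally credible."""
    return sum(p["margin"] for p in polls) / len(polls)

# Hypothetical polls; margin is candidate lead in points.
polls = [
    {"name": "Poll A", "margin": 2.0, "n": 1600, "quality": 1.0},
    {"name": "Poll B", "margin": 1.0, "n": 900, "quality": 0.8},
    {"name": "Poll C (partisan outlier)", "margin": -9.0, "n": 400, "quality": 0.2},
]

print(f"simple average:   {simple_average(polls):+.1f}")    # -2.0
print(f"weighted average: {weighted_average(polls):+.1f}")  # +1.0
```

One small, low-grade outlier drags the naive average 3 points the wrong way; discounting it by size and track record leaves the picture painted by the two credible polls largely intact.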
As for the current brouhaha (the poll/election trainwreck with which I began this inquiry), the polling for the scattered contests of this year’s off-off-year election turned out to be quite accurate. The polls predicted the narrow Democratic wins in Virginia, passage of the reproductive-rights constitutional amendment in Ohio, Democratic Gov. Andy Beshear’s reelection in Kentucky, etc., with high fidelity to actual results.
Which leaves the Times/Siena poll, showing shocking Trump strength and Biden weakness in the crucial swing states. It’s an individual poll (multistate but subject to a consistent methodology) so, yes, a grain of salt. But it should be evident that the “bad” poll and the “good” election results are neither contradictory nor incompatible. Democratic strength (fueled in large part by abortion-related backlash) and Biden weakness (mystifying as it may seem) are not mutually exclusive.
The dark mood of much of the electorate — which Trump has been relentless in fostering, going all the way back to his incendiary 2015 campaign-launching “rapists, and some, I assume, are good people” escalator speech — is a reality. And unfortunately for Biden, and perhaps Democrats more generally going into 2024, incumbent presidents and their party tend to be the prime targets of such discontent, however it has been stoked. That seems to be the import of the Times/Siena poll, and the message has also reverberated through other polling over the last few months.
While the poll coupled with the election results predictably increased the pressure on Biden to step aside, he has plenty of time to up his game (or “messaging”) and Trump may yet find himself a candidate-convict lead balloon. There’s maximal uncertainty — deliciously fascinating to some, overwhelmingly stressful to others, likely ignored by most — about how 2024 will shake out and who will be in power once the dust settles (assuming it does settle) a year from now.
Polls, for better or worse and for all their sure-to-be-noted flaws, will be our guide through the billowing fog.
Next — Polling 202: Other polling concerns and the insidious poll/vote count feedback loop.
Jonathan D. Simon is a senior editor at WhoWhatWhy and author of CODE RED: Computerized Elections and the War on American Democracy.