Why the Polls Were Wrong

It’s not good news. From Vox:

What the hell happened with the polls this year?

Yes, the polls correctly predicted that Joe Biden would win the presidency. But they got all kinds of details, and a number of Senate races, badly wrong . . .

To try to make sense of the massive failure of polling this year, I reached out to the smartest polling guy I know: David Shor, an independent data analyst and veteran of the Obama presidential campaigns who ran a massive web-based survey operation at Civis Analytics before leaving earlier this year. . . . Shor’s been trying to sell me, and basically anyone else who’ll listen, on a particular theory of what went wrong in polling in 2016, and what he thinks went wrong with polling in 2018 and 2020, too.

The theory is that the kind of people who answer polls are systematically different from the kind of people who refuse to answer polls — and that this has recently begun biasing the polls in a systematic way.

This challenges a core premise of polling, which is that you can use the responses of poll takers to infer the views of the population at large — and that if there are differences between poll takers and non-poll takers, they can be statistically “controlled” for by weighting according to race, education, gender, and so forth. . . . If these two groups still differ systematically after weighting, the results are biased.

The assumption that poll respondents and non-respondents are basically similar, once properly weighted, used to be roughly right — and then, starting in 2016, it became very, very wrong [note: of course, 2016 was when Txxxx began poisoning the political process]. People who don’t answer polls, Shor argues, tend to have low levels of trust in other people more generally. These low-trust folks used to vote similarly to everyone else. But as of 2016, they don’t: they tend to vote for Republicans.

Now, in 2020, Shor argues that the differences between poll respondents and non-respondents have gotten larger still. In part due to Covid-19 stir-craziness, Democrats, and particularly highly civically engaged Democrats who donate to and volunteer for campaigns, have become likelier to answer polls. It’s something to do when we’re all bored, and it feels civically useful. This biased the polls, Shor argues, in deep ways that even the best polls (including his own) struggled to account for.

Liberal Democrats answered more polls, so the polls overrepresented liberal Democrats and their views (even after weighting), and thus the polls gave Biden and Senate Democrats inflated odds of winning. . . .

Dylan Matthews

So, David: What the hell happened with the polls this year?

David Shor

So the basic story is that, particularly after Covid-19, Democrats got extremely excited and had very high rates of engagement. They were donating at higher rates, etc., and this translated to them also taking surveys, because they were locked at home and didn’t have anything else to do. There’s some pretty clear evidence that that’s nearly all of it: It was partisan non-response. Democrats just started taking a bunch of surveys [when they were called by pollsters, while Republicans did not]. . . .

Dylan Matthews

You mentioned social trust. Walk me through your basic theory about how people who agree to take surveys have higher levels of social trust, and how that has biased the polls in recent years.

David Shor

For three cycles in a row, there’s been this consistent pattern of pollsters overestimating Democratic support in some states and underestimating support in other states. This has been pretty consistent. It happened in 2018. It happened in 2020. And the reason that’s happening is because the way that [pollsters] are doing polling right now just doesn’t work. . . .

Fundamentally, every “high-quality public pollster” does random digit dialing. They call a bunch of random numbers, roughly 1 percent of people pick up the phone, and then they ask stuff like education, and age, and race, and gender, sometimes household size. And then they weight it up to the census, because the census says how many adults do all of those things. That works if people who answer surveys are the same as people who don’t, once you control for age and race and gender and all this other stuff.

But it turns out that people who answer surveys are really weird. They’re considerably more politically engaged than normal. . . . [They] have much higher agreeableness [a measure of how cooperative and warm people are], which makes sense, if you think about literally what’s happening.

They also have higher levels of social trust. . . . It’s a pretty massive gap. [Sociologist] Robert Putnam actually did some research on this: people who don’t trust people and don’t trust institutions are way less likely to answer phone surveys. Unsurprising! This has always been true. It just used to not matter.

It used to be that once you controlled for age and race and gender and education, people who trusted their neighbors basically voted the same as people who didn’t trust their neighbors. But then, starting in 2016, suddenly that shifted. . . . These low-trust people still vote, even if they’re not answering these phone surveys.

Dylan Matthews

So that’s 2016. Same story in 2018 and 2020?

David Shor

The same biases happened again in 2018, which people didn’t notice because Democrats won anyway. What’s different about this cycle is that in 2016 and 2018, the national polls were basically right. This time, we’ll see when all the ballots get counted, but the national polls were pretty wrong. If you look at why, I think the answer is related, which is that people who answer phone surveys are considerably more politically engaged than the overall population. . . .

Normally that doesn’t matter, because political engagement is actually not super correlated with partisanship. That is normally true, and if it wasn’t, polling would totally break. In 2020, it broke. There were very, very high levels of political engagement by liberals during Covid. You can see in the data it really happened around March: Democrats’ numbers in public Senate polling started surging. Liberals were cooped up because of Covid, and so they started answering surveys more and being more engaged.

This gets to something that’s really scary about polling, which is that polling is fundamentally built on this assumption that people who answer surveys are the same as people who don’t, once you condition on enough things. . . . But these things that we’re trying to measure are constantly changing. And so you can have a method that worked in past cycles suddenly break. . . .

There used to be a world where polling involved calling people, applying classical statistical adjustments, and putting most of the emphasis on interpretation. Now you need voter files and proprietary first-party data and teams of machine learning engineers. It’s become a much harder problem.

Dylan Matthews

. . . Pollsters need to get way more sophisticated in their quantitative methods to overcome the biases that wrecked the polls this year. Am I understanding that right?

David Shor

. . . A lot of people think that the reason why polls were wrong was because of “shy Txxxx voters.” You talk to someone, they say they’re undecided, or they say they’re gonna vote for Biden, but it wasn’t real. Then, maybe if you had a focus group, they’d say, “I’m voting for Biden, but I don’t know.” And then your ethnographer could read the uncertainty and decide, “Okay, this isn’t really a firm Biden voter.” That kind of thing is very trendy as an explanation.

But it’s not why the polls were wrong. It just isn’t. People tell the truth when you ask them who they’re voting for. They really do, on average. The reason why the polls were wrong is that the people who were answering these surveys were the wrong people. If you do your ethnographic research, if you try to recruit these focus groups, you’re going to have the same biases. They recruit focus groups by calling people! Survey takers are weird. People in focus groups are even weirder. Qualitative research doesn’t solve the problem of one group of people being really, really excited to share their opinions, while another group isn’t. As long as that bias exists, it’ll percolate down to whatever you do.

Unquote.

I think what this means is that even if you correct for the low number of Republicans who answer polls, you’re still in trouble, because you’re still polling the kind of Republicans who answer polls (the relatively nice ones). Your sample of Republican voters doesn’t represent Republicans who vote.
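To make the problem concrete, here’s a minimal simulation in Python, with every number invented for illustration. Everyone in it belongs to a single demographic weighting cell, so the demographic weights are already perfect, yet the poll still misses by about five points, because response rates and vote choice both track trust, and trust isn’t in the census:

    import numpy as np

    rng = np.random.default_rng(0)

    # One demographic weighting cell, split between low-trust and
    # high-trust voters. All numbers are invented for illustration.
    N = 1_000_000
    low_trust = rng.random(N) < 0.5

    # Low-trust voters lean more Republican...
    votes_rep = np.where(low_trust,
                         rng.random(N) < 0.60,   # low-trust: 60% Republican
                         rng.random(N) < 0.45)   # high-trust: 45% Republican

    # ...and almost never answer the phone.
    responds = np.where(low_trust,
                        rng.random(N) < 0.002,   # low-trust response rate
                        rng.random(N) < 0.010)   # high-trust: 5x as likely

    print(f"true Republican share:   {votes_rep.mean():.1%}")            # ~52.5%
    print(f"polled Republican share: {votes_rep[responds].mean():.1%}")  # ~47.5%

No reweighting by age, race, education or gender can repair this, because the poll is weighted to a census that doesn’t measure trust.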

This does not bode well for the two Senate races in Georgia. Polls show the Democrats are a few points behind. When the run-off elections occur in January, the polls will probably be wrong again, unless the pollsters have quickly figured out how to solve this problem.

Our COVID Mortality Rate Is Down 85%?

That’s a statistic in the news, but what does it mean? CNN reported this on Friday:

Friday’s case count of at least 80,005 surpasses the country’s previous one-day high of 77,362, reported July 16, according to Johns Hopkins University.
 
US Surgeon General Dr. Jerome Adams cautioned earlier Friday that hospitalizations are starting to go up in 75% of the jurisdictions across the country, and officials are concerned that in a few weeks, deaths will also start to increase.
 
The good news, Adams said, is that the country’s Covid-19 mortality rate has decreased by about 85% thanks to multiple factors, including the use of remdesivir, steroids and better management of patients.
 
According to CNN, the Surgeon General made these remarks during an online discussion of global health policy at the Meridian Global Leadership Summit. Meridian is a “non-profit, non-partisan diplomacy center” in Washington. I couldn’t find exactly what he said, either on Meridian’s site, the Surgeon General’s site or the Surgeon General’s Twitter account. The Centers for Disease Control and Prevention doesn’t seem to report a mortality rate.
 
Looking at statistics from The New York Times, however, indicates what the Surgeon General was talking about. Back in mid-April, the US was reporting around 2,200 deaths a day against roughly 32,000 new confirmed cases a day. Now around 800 deaths a day are being reported against roughly 68,000 new cases. That translates into 6.9% of cases ending in death in April vs. 1.2% now, a decline of 83%. So it’s true that the mortality rate has dropped quite a lot.
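The back-of-the-envelope arithmetic, using those rounded daily figures (crude, since deaths lag new cases by a few weeks, but good enough to see the size of the drop):

    april_rate = 2_200 / 32_000   # daily deaths / daily new cases, mid-April
    oct_rate   =   800 / 68_000   # the same ratio now
    print(f"April: {april_rate:.1%}, now: {oct_rate:.1%}, "
          f"decline: {1 - oct_rate / april_rate:.0%}")
    # -> April: 6.9%, now: 1.2%, decline: 83%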
 
This is confirmed by two studies reported by National Public Radio:
 
Two new peer-reviewed studies are showing a sharp drop in mortality among hospitalized COVID-19 patients. The drop is seen in all groups, including older patients and those with underlying conditions, suggesting that physicians are getting better at helping patients survive their illness.
 
The article mentions other reasons the mortality rate may be dropping:
 
[Researchers] say that factors outside of doctors’ control are also playing a role in driving down mortality. . . . Mask-wearing may be helping by reducing the initial dose of virus a person receives, thereby lessening the overall severity of illness for many patients. . . . Keeping hospitals below their maximum capacity also helps to increase survival rates. When cases surge and hospitals fill up, “staff are stretched, mistakes are made, it’s no one’s fault — it’s that the system isn’t built to operate near 100%”. . . 
 
This hardly means we’ve turned the corner on COVID-19, as one of the presidential candidates claims. Even the current mortality rate of roughly 1% is still high relative to other common infectious diseases. Serious illness is never a joy, and even patients who survive COVID-19 sometimes suffer long-term effects.
 
In addition, two other numbers recently reported aren’t encouraging. The pandemic is causing significantly more deaths, either directly or indirectly, than are being reported:
 
In the most updated count to date, researchers at the Centers for Disease Control and Prevention have found that nearly 300,000 more people in the United States died from late January to early October this year compared to the average number of people who died in recent years. Just two-thirds of those deaths were counted as Covid-19 fatalities, highlighting how the official U.S. death count — now standing at about 220,000 [or 225,000] — is not fully inclusive [Stat].
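The arithmetic in that paragraph is worth spelling out: two-thirds of roughly 300,000 excess deaths is about 200,000, which means about 100,000 deaths attributable, directly or indirectly, to the pandemic never made it into the official Covid-19 count:

    excess  = 300_000          # CDC excess-death estimate, late Jan. to early Oct.
    counted = excess * 2 // 3  # the two-thirds recorded as Covid-19 deaths
    print(f"counted: ~{counted:,}; uncounted: ~{excess - counted:,}")
    # -> counted: ~200,000; uncounted: ~100,000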
 
One model predicts that the next four months will be especially bad in the US:
 
More than 511,000 lives could be lost by 28 February next year, modeling led by scientists from the University of Washington found.

This means that with cases surging in many states, particularly the upper Midwest, what appears to be a third major peak of coronavirus infections in the US could lead to nearly 300,000 people dying in just the next four months.

In fact the University of Washington warned that the situation will be even more disastrous if states continue to ease off on measures designed to restrict the spread of the virus, such as the shuttering of certain businesses and social distancing edicts. If states wind down such protections, the death toll could top 1 million people in America by 28 February, the UW study found [The Guardian].

Finally, the presidential candidate who doesn’t think we’ve turned the corner offered this timely reminder:

President Txxxx’s plan to beat COVID-19

Nine days.

2020 Won’t Be 2016 (or 2000)

We’re entering what’s been called, and what’s sure to feel like, “the longest two weeks in human history”. A neuroscientist who writes for Scientific American says we shouldn’t worry too much about what’s going to happen:

Will we be surprised again this November the way Americans were on Nov. 9, 2016 when they awoke to learn that reality TV star Dxxxx Txxxx had been elected president?

. . . Another surprise victory is unlikely to happen again if this election is looked at from the same perspective of neuroscience that I used to account for the surprising outcome in 2016. Briefly, that article explained how our brain provides two different mechanisms of decision-making; one is conscious and deliberative, and the other is automatic, driven by emotion and especially by fear.

Txxxx’s strategy does not target the neural circuitry of reason in the cerebral cortex; it provokes the limbic system. In the 2016 election, undecided voters were influenced by the brain’s fear-driven impulses—more simply, gut instinct—once they arrived inside the voting booth, even though they were unable to explain their decision to pre-election pollsters in a carefully reasoned manner.

In 2020, Txxxx continues to use the same strategy of appealing to the brain’s threat-detection circuitry and emotion-based decision process to attract votes and vilify opponents. . . .

But fear-driven appeals will likely persuade fewer voters this time, because we overcome fear in two ways: by reason and experience. Inhibitory neural pathways from the prefrontal cortex to the limbic system will enable reason to quash fear if the dangers are not grounded in fact. . . .

A psychology- and neuroscience-based perspective also illuminates Txxxx’s constant interruptions and insults during the first presidential debate, steamrolling over the moderator’s futile efforts to have a reasoned airing of facts and positions. The structure of a debate is designed to engage the deliberative reasoning in the brain’s cerebral cortex, so Txxxx annihilated the format to inflame emotion in the limbic system.

Txxxx’s dismissal of experts, be they military generals, career public servants, scientists or even his own political appointees, is necessary for him to sustain the subcortical decision-making in voters’ minds that won him election and sustains his support. . . . In his rhetoric, Txxxx does not address factual evidence; he dismisses or suppresses it even for events that are apparent to many, including global warming, foreign intervention in U.S. elections, the trivial head count at his inauguration, and even the projected path of a destructive hurricane. Instead, “alternative facts” or fabrications are substituted.

. . . Reason cannot always overcome fear, as [Post-Traumatic Stress Disorder] demonstrates; but the brain’s second mechanism of neutralizing its fear circuitry—experience—can do so. Repeated exposure to the fearful situation where the outcome is safe will rewire the brain’s subcortical circuitry. This is the basis for “extinction therapy” used to treat PTSD and phobias. For many, credibility has been eroded by Txxxx’s outlandish assertions, like suggesting injections of bleach might cure COVID-19, or enthusing over a plant toxin touted by a pillow salesman, while scientific experts in attendance grimace and bite their lips.

In the last election Txxxx was a little-known newcomer as a political figure, but that is not the case this time with either candidate. The “gut-reaction” decision-making process excels in complex situations where there is not enough factual information or time to make a reasoned decision. We follow gut instinct, for example, when selecting a dish from a menu at a new restaurant, where we have never seen or tasted the offering before. We’ve had our fill of the politics this time, no matter what position one may favor. Whether voters choose to vote for Txxxx on the basis of emotion or reason, they will be better able to articulate the reasons, or rationalizations, for their choice. This should give pollsters better data to make a more accurate prediction.

Unquote.

Pollsters did make an accurate prediction of the national vote in 2016 (Clinton won it). Most of them didn’t take into account the Electoral College, however, or anticipate the last-minute intervention by big-mouth FBI Director James Comey.

In 2000, the Electoral College result depended on an extremely close election in one state. That allowed the Republicans on the Supreme Court to get involved. There’s no reason to think that will happen again, despite the president’s hopes that it will.

Quantum Reality by Jim Baggott

The author is a former academic physicist with a leaning toward the experimental side of physics, as opposed to the theoretical side. From the preamble:

I know why you’re here.

You know that quantum mechanics is an extraordinarily successful scientific theory, on which much of our modern, tech-obsessed lifestyles depend. . . . You also know that it is completely mad. Its discovery forced open the window on all those comfortable notions we had gathered about physical reality . . . and shoved them out. Although quantum mechanics quite obviously works, it appears to leave us chasing ghosts and phantoms, particles that are waves and waves that are particles, cats that are at once both alive and dead, lots of seemingly spooky goings-on, and a desperate desire to lie down quietly in a darkened room.

But, hold on, if we’re prepared to be a little more specific about what we mean when we talk about “reality” and a little more circumspect about how we think a scientific theory might represent such a reality, then all the mystery goes away [Note: not really] . . . 

But . . . a book that says, “Honestly, there is no mystery” would . . . be completely untrue. For sure we can rid ourselves of all the mystery in quantum mechanics, but only by abandoning any hope of deepening our understanding of nature. We must become content to use the quantum representation simply as a way to perform calculations and make predictions, and we must resist the temptation to ask: But how does nature actually do that? And there lies the rub: for what is the purpose of a scientific theory if not to aid our understanding of the physical world?

. . . The choice we face is a philosophical one. There is absolutely nothing scientifically wrong with a depressingly sane interpretation of quantum mechanics in which there is no mystery. If we choose instead to pull on the loose thread, we are inevitably obliged to take the quantum representation at face value, and interpret its concepts rather more literally. Surprise, surprise: the fabric unravels to give us all those things about the quantum world that we find utterly baffling, and we’re right back where we started.

My purpose in this book is (hopefully) . . . to try to explain what it is about quantum mechanics that forces us to confront this kind of choice, and why this is entirely philosophical in nature. Making different choices leads to different interpretations or even modifications of the quantum representation and its concepts, in what I call . . . the game of theories.

Mr. Baggott follows the usual path that includes the work of Einstein and Niels Bohr and Erwin Schrödinger and ends with various theories of the multiverse. He lost me around page 160 in chapter 7. Up until then, I felt like I was understanding almost everything. Given the nature of quantum mechanics, that probably meant I was deeply confused. After that, my confusion was obvious.

He does make clear how anyone trying to understand the reality behind quantum mechanics, or to “interpret” it, ends up veering into philosophical speculation. His strong preference is for interpretations that can be tested empirically. That’s one reason he’s skeptical about multiverse theories, which don’t seem to be testable at all.

I’m glad I read the book, but I could have jumped from chapter 7 to the Epilogue, which is entitled “I’ve Got a Very Bad Feeling About This”:

I hope I’ve done enough in this book to explain the nature of our dilemma. We can adopt an anti-realist interpretation in which all our conceptual problems vanish, but which obliges us to accept that we’ve reached the limit of our ability to access deeper truths about a reality of things-in-themselves. The anti-realist interpretations tell us that there’s nothing to see here. Of necessity, they offer no hints as to where we might look to gain some new insights or understanding. They are passive; mute witnesses to the inscrutability of nature.

In contrast, the simpler and more palatable realist interpretations based on local or crypto-local hidden variables offered plenty of hints and continue to motivate ever more exquisitely subtle experiments. Alas, the evidence is now quite overwhelming and all but the most stubborn of physicists accept that nature denies us this easy way out. If we prefer a realist interpretation, taking the wavefunction and the conceptual problems this implies at face value, then we’re left with what I can only call a choice between unpalatable evils. We can choose de Broglie-Bohm theory and accept non-local spooky action at a distance. We can choose to add a rather ad hoc spontaneous collapse mechanism and hope for the best. We can choose to involve consciousness in the mix, conflating one seemingly intractable problem with another. Or we can choose Everett, many worlds and the multiverse. . . .

There may be another way out. I’m pretty confident that quantum mechanics is not the end. Despite its unparalleled success, we know it doesn’t incorporate space and time in the right way [it seems to presume absolute space and absolute simultaneity, not Einstein’s relative spacetime]. . . . It may well be that any theory that transcends quantum mechanics will still be rife with conceptual problems and philosophical conundrums. But it would be nice to discover that, despite appearances to the contrary, there was indeed something more to see here.

That’s the end of the book. 

I got a copy of Quantum Reality after reading a very positive review by another physicist, Sabine Hossenfelder. She said it’s “engagingly written” and requires “no background knowledge in physics”. Maybe not, but a background would help, especially when you get to chapter 7.

I did acquire one idea, which fits with an idea I already had. It seems that the famous two-slit experiment, in which a single photon appears to take multiple paths, has a simple solution. When the photon is sent on its way, it’s a wave. It passes through both slits at the same time. Then, when it hits the screen on the other side of the two slits, it becomes a particle. Maybe this is the de Broglie-Bohm theory referred to above, which implies “spooky action at a distance”. But it sounds plausible to me.

The wave instantaneously becoming a particle seems (to me) to fit with the way entangled particles simultaneously adopt opposing characteristics. One is measured and found to be “up”, which means the other instantly becomes “down”, no matter how far away the two particles are. This suggests that spacetime isn’t fundamental. The distance we perceive as being far too great for two particles to immediately affect each other isn’t the fundamental reality. There’s something going on that’s deeper than spacetime. So the way in which a wave that’s spread out simultaneously disappears, resulting in a single particle hitting a screen, reveals the same thing.

So I feel like I’m making a bit of progress in understanding physics. This is most likely incorrect, but it makes me feel better. Now all I have to do is figure out why physicists claim we couldn’t find the location of the Big Bang. Sure, space is expanding in all directions from the Big Bang, they say, but they deny the universe has a center, where the Big Bang occurred (it would make a great location for a museum and a gift shop). I don’t understand their reasons for saying there is no center.

But one small, confused step at a time.

An Overlooked Variable May Be the Key to the Pandemic

A writer for The Atlantic argues that there’s “a potential, overlooked way of understanding this pandemic that would help answer [questions about it], reshuffle many of the current heated arguments, and, crucially, help us get the spread of COVID-19 under control”:

By now many people have heard about R0—the basic reproductive number of a pathogen, a measure of its contagiousness on average. But unless you’ve been reading scientific journals, you’re less likely to have encountered k, the measure of its dispersion. The definition of k is a mouthful, but it’s simply a way of asking whether a virus spreads in a steady manner or in big bursts, whereby one person infects many, all at once. After nine months of collecting epidemiological data, we know that this is an overdispersed pathogen, meaning that it tends to spread in clusters, but this knowledge has not yet fully entered our way of thinking about the pandemic—or our preventive practices.

The now-famed R0 (pronounced as “r-naught”) is an average measure of a pathogen’s contagiousness, or the mean number of susceptible people expected to become infected after being exposed to a person with the disease. If one ill person infects three others on average, the R0 is three. This parameter has been widely touted as a key factor in understanding how the pandemic operates. News media have produced multiple explainers and visualizations for it. . . . Dashboards track its real-time evolution, often referred to as R or Rt, in response to our interventions. . . .

Unfortunately, averages aren’t always useful for understanding the distribution of a phenomenon, especially if it has widely varying behavior. If Amazon’s CEO, Jeff Bezos, walks into a bar with 100 regular people in it, the average wealth in that bar suddenly exceeds $1 billion. . . . Clearly, the average is not that useful a number to understand the distribution of wealth in that bar, or how to change it. . . . Meanwhile, if the bar has a person infected with COVID-19, and if it is also poorly ventilated and loud, causing people to speak loudly at close range, almost everyone in the room could potentially be infected—a pattern that’s been observed many times since the pandemic began, and that is similarly not captured by R. That’s where the dispersion comes in.

There are COVID-19 incidents in which a single person likely infected 80 percent or more of the people in the room in just a few hours. But, at other times, COVID-19 can be surprisingly much less contagious. Overdispersion and super-spreading of this virus are found in research across the globe. A growing number of studies estimate that a majority of infected people may not infect a single other person. A recent paper found that in Hong Kong, which had extensive testing and contact tracing, about 19 percent of cases were responsible for 80 percent of transmission, while 69 percent of cases did not infect another person.

This finding is not rare: Multiple studies from the beginning have suggested that as few as 10 to 20 percent of infected people may be responsible for as much as 80 to 90 percent of transmission, and that many people barely transmit it.
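This skew is exactly what the standard model of super-spreading, a negative binomial “offspring” distribution, produces. Here’s a toy simulation, not from the article; the R0 and k values are assumptions for illustration, with k near the low end of published estimates for SARS-CoV-2:

    import numpy as np

    rng = np.random.default_rng(0)

    R0, k = 3.0, 0.1   # assumed mean and dispersion; smaller k = burstier
    cases = 100_000

    # Secondary infections per case: negative binomial with mean R0
    # and dispersion k.
    offspring = rng.negative_binomial(n=k, p=k / (k + R0), size=cases)

    print(f"mean secondary infections: {offspring.mean():.2f}")      # ~ R0
    print(f"cases infecting no one:    {(offspring == 0).mean():.0%}")

    # What fraction of cases accounts for 80% of all transmission?
    worst_first = np.sort(offspring)[::-1]
    cumulative = np.cumsum(worst_first) / offspring.sum()
    top = (np.argmax(cumulative >= 0.8) + 1) / cases
    print(f"share of cases causing 80% of transmission: {top:.0%}")

With these assumed parameters, roughly 70 percent of cases infect no one at all, and about 10 to 15 percent of cases account for 80 percent of transmission, matching the shape of the numbers reported above.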

This highly skewed, imbalanced distribution means that an early run of bad luck with a few super-spreading events, or clusters, can produce dramatically different outcomes even for otherwise similar countries. Scientists looked globally at known early-introduction events, in which an infected person comes into a country, and found that in some places, such imported cases led to no deaths or known infections, while in others, they sparked sizable outbreaks. . . . In Daegu, South Korea, just one woman, dubbed Patient 31, generated more than 5,000 known cases in a megachurch cluster.

Unsurprisingly, SARS-CoV, the previous incarnation of SARS-CoV-2 that caused the 2003 SARS outbreak, was also overdispersed in this way: The majority of infected people did not transmit it, but a few super-spreading events caused most of the outbreaks. MERS, another coronavirus cousin of SARS, also appears overdispersed, but luckily, it does not—yet—transmit well among humans.

This kind of behavior, alternating between being super infectious and fairly noninfectious, is exactly what k captures, and what focusing solely on R hides. . . .

Nature and society are replete with such imbalanced phenomena, some of which are said to work according to the Pareto principle, named after the sociologist Vilfredo Pareto. Pareto’s insight is sometimes called the 80/20 principle—80 percent of outcomes of interest are caused by 20 percent of inputs—though the numbers don’t have to be that strict. Rather, the Pareto principle means that a small number of events or people are responsible for the majority of consequences. This will come as no surprise to anyone who has worked in the service sector, for example, where a small group of problem customers can create almost all the extra work. . . .

To fight a super-spreading disease effectively, policy makers need to figure out why super-spreading happens, and they need to understand how it affects everything, including our contact-tracing methods and our testing regimes.

There may be many different reasons a pathogen super-spreads. Yellow fever spreads mainly via the mosquito Aedes aegypti, but until the insect’s role was discovered, its transmission pattern bedeviled many scientists. . . . Much is still unknown about the super-spreading of SARS-CoV-2. It might be that some people are super-emitters of the virus, in that they spread it a lot more than other people. . . .

In study after study, we see that super-spreading clusters of COVID-19 almost overwhelmingly occur in poorly ventilated, indoor environments where many people congregate over time—weddings, churches, choirs, gyms, funerals, restaurants, and such—especially when there is loud talking or singing without masks. For super-spreading events to occur, multiple things have to be happening at the same time, and the risk is not equal in every setting and activity. . . .

[Muge Cevik of the University of St. Andrews] identifies “prolonged contact, poor ventilation, [a] highly infectious person, [and] crowding” as the key elements for a super-spreader event. Super-spreading can also occur indoors beyond the six-feet guideline, because SARS-CoV-2, the pathogen causing COVID-19, can travel through the air and accumulate, especially if ventilation is poor. Given that some people infect others before they show symptoms, or when they have very mild or even no symptoms, it’s not always possible to know if we are highly infectious ourselves. We don’t even know if there are more factors yet to be discovered that influence super-spreading.

But we don’t need to know all the sufficient factors that go into a super-spreading event to avoid what seems to be a necessary condition most of the time: many people, especially in a poorly ventilated indoor setting, and especially not wearing masks. As Natalie Dean, a biostatistician at the University of Florida, told me, given the huge numbers associated with these clusters, targeting them would be very effective in getting our transmission numbers down.

Overdispersion should also inform our contact-tracing efforts. In fact, we may need to turn them upside down. Right now, many states and nations engage in what is called forward or prospective contact tracing. Once an infected person is identified, we try to find out with whom they interacted afterward so that we can warn, test, isolate, and quarantine these potential exposures. But that’s not the only way to trace contacts. And, because of overdispersion, it’s not necessarily where the most bang for the buck lies. Instead, in many cases, we should try to work backwards to see who first infected the subject.

Because of overdispersion, most people will have been infected by someone who also infected other people, because only a small percentage of people infect many at a time, whereas most infect zero or maybe one person. As Adam Kucharski, an epidemiologist, . . . explained to me, if we can use retrospective contact tracing to find the person who infected our patient, and then trace the forward contacts of the infecting person, we are generally going to find a lot more cases compared with forward-tracing contacts of the infected patient. [Those] will merely identify potential exposures, many of which will not happen anyway, because most transmission chains die out on their own. . . .
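The reason backward tracing wins is size-biased sampling: you reach a given infector with probability proportional to how many people they infected, so the “parent” of a randomly chosen case is disproportionately likely to sit in a big cluster. A sketch, reusing the illustrative offspring distribution from above:

    import numpy as np

    rng = np.random.default_rng(1)

    R0, k = 3.0, 0.1
    offspring = rng.negative_binomial(n=k, p=k / (k + R0), size=100_000)

    # Forward tracing from a random case finds, on average, its R0
    # onward infections.
    print(f"forward:  mean onward infections = {offspring.mean():.1f}")

    # Backward tracing from a random infectee reaches an infector with
    # probability proportional to that infector's secondary-case count.
    probs = offspring / offspring.sum()
    parents = rng.choice(len(offspring), size=50_000, p=probs)
    print(f"backward: mean cluster size found = {offspring[parents].mean():.1f}")

With these numbers the forward direction finds about 3 onward infections per case, while the backward direction lands in clusters averaging over 30, an order of magnitude more cases per trace.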

Even in an overdispersed pandemic, it’s not pointless to do forward tracing to be able to warn and test people, if there are extra resources and testing capacity. But it doesn’t make sense to do forward tracing while not devoting enough resources to backward tracing and finding clusters, which cause so much damage. . . .

Perhaps one of the most interesting cases has been Japan, a country with middling luck that got hit early on and followed what appeared to be an unconventional model, not deploying mass testing and never fully shutting down. By the end of March, influential economists were publishing reports with dire warnings, predicting overloads in the hospital system and huge spikes in deaths. The predicted catastrophe never came to be, however, and although the country faced some future waves, there was never a large spike in deaths despite its aging population, uninterrupted use of mass transportation, dense cities, and lack of a formal lockdown.

[Hitoshi Oshitani of Japan’s COVID-19 Cluster Taskforce] told me that in Japan, they had noticed the overdispersion characteristics of COVID-19 as early as February, and thus created a strategy focusing mostly on cluster-busting, which tries to prevent one cluster from igniting another. Oshitani said he believes that “the chain of transmission cannot be sustained without a chain of clusters or a megacluster.” Japan thus carried out a cluster-busting approach, including undertaking aggressive backward tracing to uncover clusters. Japan also focused on ventilation, counseling its population to avoid places where the three C’s come together—crowds in closed spaces in close contact, especially if there’s talking or singing . . .

Oshitani contrasts the Japanese strategy, nailing almost every important feature of the pandemic early on, with the Western response, trying to eliminate the disease “one by one” when that’s not necessarily the main way it spreads. Indeed, Japan got its cases down, but kept up its vigilance: When the government started noticing an uptick in community cases, it initiated a state of emergency in April and tried hard to incentivize the kinds of businesses that could lead to super-spreading events, such as theaters, music venues, and sports stadiums, to close down temporarily. Now schools are back in session in person, and even stadiums are open—but without chanting.

It’s not always the restrictiveness of the rules, but whether they target the right dangers. As [one scientist] put it, “Japan’s commitment to ‘cluster-busting’ allowed it to achieve impressive mitigation with judiciously chosen restrictions. Countries that have ignored super-spreading have risked getting the worst of both worlds: burdensome restrictions that fail to achieve substantial mitigation. The U.K.’s recent decision to limit outdoor gatherings to six people while allowing pubs and bars to remain open is just one of many such examples.”

Could we get back to a much more normal life by focusing on limiting the conditions for super-spreading events, aggressively engaging in cluster-busting, and deploying cheap, rapid mass tests—that is, once we get our case numbers down to low enough numbers to carry out such a strategy? Many places with low community transmission could start immediately. . . .