In the weeks since Americans began practicing social distancing measures, public officials and health experts have turned to coronavirus prediction models for guidance about when normal life can resume.
Models aiming to predict when parts of the country will reach their peak number of COVID-19 cases — and how many deaths and infections will occur in the time leading up to the peak — have been published by researchers and universities across the country. White House officials have referenced prediction models in their response to the outbreak, and the Centers for Disease Control and Prevention cites nine models on its website for clues to when the outbreak will slow.
But many of the models’ projections have changed repeatedly since health officials began citing them, causing confusion about the future of COVID-19 and about how reliable models are as tools. Predictions are also likely to change in Texas, which has begun allowing businesses to reopen, creating the potential for a second wave of infections.
Health experts say the models are extremely complex and shouldn’t be taken at face value. Here’s what you need to know:
TWO KINDS OF MODELS
Researchers are using two types of models to forecast COVID-19.
One is a statistical model, which uses outbreak trends to make predictions. A widely cited model created by the Institute for Health Metrics and Evaluation at the University of Washington is an example. The IHME model, which has been referenced by White House officials in recent weeks, uses data from outbreaks in other parts of the world to predict when U.S. states will peak or use up their health care resources.
The other type is a mechanistic model, which predicts how case outcomes would be affected by certain policy actions. Models that show the effects of certain levels of social distancing on COVID-19 cases and deaths are examples. Columbia University made a model that projects whether, or when, hospitals would be overwhelmed in different U.S. areas based on differing levels of social distancing.
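To make the idea of a mechanistic model concrete, here is a minimal sketch of a textbook "SIR" model, which sorts a community into susceptible, infected and recovered people and steps forward one day at a time. It is not the Columbia model or any other published model. The population size, the five-day infectious period and the `contact_reduction` setting are illustrative assumptions, and the reproductive rate of 2.2 is just one commonly cited estimate.

```python
def simulate_sir(population, initial_infected, r0, infectious_days,
                 contact_reduction, days):
    """Toy daily-step SIR model. Returns the number of people infected on each day.

    contact_reduction is a hypothetical knob: 0.0 means no social distancing,
    0.5 means people cut their contacts in half.
    """
    gamma = 1.0 / infectious_days                   # share of infected who recover each day
    beta = r0 * gamma * (1.0 - contact_reduction)   # daily transmission rate after distancing
    susceptible = population - initial_infected
    infected = float(initial_infected)
    infected_by_day = [infected]
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        infected_by_day.append(infected)
    return infected_by_day


# Compare the peak number of simultaneous infections under three assumed
# levels of social distancing (all other inputs are illustrative).
for reduction in (0.0, 0.25, 0.5):
    curve = simulate_sir(1_000_000, 10, 2.2, 5, reduction, 365)
    print(f"contacts cut by {reduction:.0%}: peak of about {max(curve):,.0f} infected at once")
```

In this sketch, cutting contacts lowers the peak. That peak is the kind of number a hospital planner would compare against the beds and ventilators available.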
DOZENS OF DATA POINTS
To create a model, scientists plug dozens of different data points into a mathematical equation.
The data includes information about the virus, such as how it spreads and how long people are immune after they recover; information about the community, such as how many people an individual comes into contact with each day; and information about the health care system, such as how many beds and ventilators are available.
Some of this information, such as a community’s hospital capacity and the number of ventilators it has, is known with certainty. The challenge is that other data points needed to create projections are not known.
ASSUMPTIONS AND UNCERTAINTY
In a rapidly evolving crisis such as the COVID-19 outbreak, concrete data can be challenging to find. Health experts are still researching how exactly the virus spreads.
One important number that experts still aren’t completely certain of is the virus’ reproductive rate, or the average number of new infections that result from each case. An editorial in the New England Journal of Medicine put the rate at 2.2; other health experts have cited numbers between 2 and 4.
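A rough calculation shows why the gap between 2 and 4 matters so much. The sketch below makes a simplifying assumption that does not hold for long in a real outbreak: every case infects the same number of people and nobody has yet become immune. Under that assumption, the number of infections in each "generation" of spread is the reproductive rate multiplied by itself once per generation. The function name is made up for illustration.

```python
def infections_in_generation(reproductive_rate, generation):
    """Infections caused in a given generation of spread, starting from one case.

    Assumes unchecked spread: every case infects `reproductive_rate` others
    and no one is immune yet.
    """
    return reproductive_rate ** generation


for rate in (2.0, 2.2, 4.0):
    fifth = infections_in_generation(rate, 5)
    print(f"reproductive rate {rate}: about {round(fifth)} infections in the fifth generation")
```

After five generations, a rate of 2 yields 32 infections in that generation, a rate of 2.2 yields about 52, and a rate of 4 yields 1,024. A seemingly small difference in one input compounds into a very large difference in the projection.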
The level of immunity someone has after recovering from the virus also is uncertain — while the presence of antibodies indicates some protection, health experts aren’t sure how effective that protection is or how long it lasts. They also aren’t entirely sure how long it takes after infection for a person to be able to infect others.
When experts aren’t sure about a data point, they make an assumption based on the information available. That means that these models have a significant level of uncertainty.
The IHME model, for example, includes a shaded area of uncertainty for each of its predictions, meaning the true number is likely to fall anywhere within that range. Currently, the upper and lower values for new cases in the weeks to come are thousands apart.
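One simple way to see where a shaded range like this can come from is to run the same model many times, each with a different plausible value for an uncertain input, and then report the span that contains most of the results. The sketch below does that with a toy epidemic model. It is not how the IHME model works, which takes a statistical approach. The population, the five-day infectious period and the choice to draw the reproductive rate evenly between 2 and 4 are all illustrative assumptions.

```python
import random


def daily_new_cases(r0, days, population=1_000_000, initial_infected=10,
                    infectious_days=5):
    """Toy daily-step SIR model. Returns new infections for each day."""
    gamma = 1.0 / infectious_days   # share of infected who recover each day
    beta = r0 * gamma               # daily transmission rate
    susceptible = population - initial_infected
    infected = float(initial_infected)
    new_cases = []
    for _ in range(days):
        infections = beta * susceptible * infected / population
        susceptible -= infections
        infected += infections - gamma * infected
        new_cases.append(infections)
    return new_cases


def uncertainty_band(days, runs=200, seed=0):
    """Run the model with many assumed reproductive rates and return, for each
    day, the (lower, upper) values that bracket roughly 95% of the runs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    curves = [daily_new_cases(rng.uniform(2.0, 4.0), days) for _ in range(runs)]
    band = []
    for day in range(days):
        values = sorted(curve[day] for curve in curves)
        lower = values[int(0.025 * (runs - 1))]
        upper = values[int(0.975 * (runs - 1))]
        band.append((lower, upper))
    return band


band = uncertainty_band(days=30)
low, high = band[-1]
print(f"Day 30 new cases: between {low:,.0f} and {high:,.0f}")
```

Here only one input is uncertain. Real models juggle several uncertain inputs at once, which is part of why their ranges can be so wide.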
Adding to the uncertainty, the models can’t account for some things, such as how increased testing and contact tracing will affect the outbreak and to what extent people will follow social distancing guidelines. Also, big jumps in daily cases or deaths in some areas could stem from testing backlogs rather than real changes in the outbreak, skewing the data.
Projections can change over time as more is learned, as the models’ repeated revisions have already shown. As more information about the virus and health care systems becomes known, modelers will add it to their equations, which can change projections, though still with some degree of uncertainty.
WHAT ARE THEY GOOD FOR?
The British statistician George Box famously said, “All models are wrong, but some are useful.” Health experts and researchers have pointed to this quote to emphasize that models aren’t meant to precisely predict the future, only to help us prepare for it.
“No model is perfect, but most models are somewhat useful,” John Allen Paulos, a professor of math at Temple University, told U.S. News & World Report. “But we can’t confuse the model with reality.”
Health experts say many of the models that include a range of numbers show best- and worst-case scenarios for the weeks ahead. Instead of focusing on whether a prediction is right, health experts say, public officials should prepare as if the worst-case scenario will come true.
“I think it’s key not to get fixated on the exact numbers,” Dominique Heinke, an epidemiologist in Massachusetts, told Vox. “You can look at a range of models and say, ‘We can expect it to be at least this bad.’ ”
If predictions change as more data becomes available, that doesn’t mean a model was “wrong.” That just means researchers have better information about the virus.
And by showing a worst-case scenario, models can motivate people to act.
“Unlike the weather … we actually influence the outcome,” Caitlin Rivers, a professor at the Johns Hopkins Center for Health Security, told Vox. “So people see the numbers, and they are motivated then to be more aware, stay home and using good hygiene and doing all the things that really change that outcome.”