NEWS AND VIEWS

Unmasking the limitations of statistics in the early age of coronavirus

4/24/2021

Public health advice and directives are required to be evidence-based, but during pandemics they are also urgently needed in real time. Statements, often used to evaluate policy options, have been made about SARS-CoV-2’s testing, origin, infectiousness, contagiousness, severity, and evolving variants, as well as about the many ways to combat the virus’s impacts through population- to individual-level approaches such as mask wearing, social distancing, vaccination, convalescent plasma treatment, antivirals and steroids. It is useful to review the inherent limitations of commonplace statistics, analyses and approaches in the ongoing fight against SARS-CoV-2. This unmasking is not meant to debunk myths, but rather to highlight the nuanced world we live in with respect to testing for epidemiological outcomes of interventions or describing treatments of Covid-19 as efficacious or effective. Below, I share three of the best treatments of the limitations of statistics in this early age of coronavirus.

Testing for the virus: sensitivity and specificity

A first step in dealing with a contagious and severe disease is knowing which individuals are infected, so as to isolate them and quarantine their contacts. Let’s say 1% of 100,000 people tested (1,000 people) are actually infected, and our test has a sensitivity of 95%: it returns a positive result for an infected individual 95% of the time. Then ~50 of the 1,000 infected persons score as being uninfected; they are false negatives. For a virus with superspreading potential, it would not be very useful to tell 50 people that they are uninfected when, in fact, they are infected. We can also consider the specificity of the test. Say it is also 95%, so that the likelihood of returning a false positive for an uninfected individual is 5%. Here, ~4,950 of the 99,000 uninfected individuals test positive for the virus. Our estimate of the % of individuals infected is therefore (950 + 4,950)/100,000, or ca. 5.9%, well above the actual 1% infected. The main point here, which can be shown with additional math, is that the estimate of % infected upon which case rates are based depends on the test’s sensitivity and specificity, but also on the background level of infection, which we do not actually know. These factors likely vary from test to test, place to place and time to time. Estimating the % infected depends on our combining tests using various methods, retesting often, and assessing clinical symptoms. Using those data in analyses depends on our being clear about how the outcomes were derived (see Waltner-Toews 2020 for a treatment of this and related problems during pandemics).1
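The arithmetic in this example can be checked with a short Python sketch. The function and variable names are illustrative assumptions, not taken from any cited source. The second function uses the Rogan-Gladen estimator, a standard correction that is not part of the discussion above; it recovers the true prevalence only if sensitivity and specificity are known exactly, which in practice they rarely are.

```python
def screening_outcomes(n_tested, prevalence, sensitivity, specificity):
    """Expected counts when a population is tested once with an imperfect test."""
    infected = n_tested * prevalence
    uninfected = n_tested - infected
    true_pos = infected * sensitivity            # infected and correctly flagged
    false_neg = infected - true_pos              # infected but told they are uninfected
    false_pos = uninfected * (1 - specificity)   # uninfected but flagged as infected
    return {
        "false_negatives": false_neg,
        "false_positives": false_pos,
        # share of all tests that come back positive (the naive "% infected")
        "apparent_prevalence": (true_pos + false_pos) / n_tested,
        # share of positive results that are real infections
        "positive_predictive_value": true_pos / (true_pos + false_pos),
    }


def rogan_gladen(apparent_prevalence, sensitivity, specificity):
    """Correct an apparent prevalence for known test error (assumed addition)."""
    return (apparent_prevalence + specificity - 1) / (sensitivity + specificity - 1)


# Values from the worked example: 100,000 tested, 1% infected, 95% / 95% test.
result = screening_outcomes(100_000, 0.01, 0.95, 0.95)
for name, value in result.items():
    print(f"{name}: {value:.3f}")

corrected = rogan_gladen(result["apparent_prevalence"], 0.95, 0.95)
print(f"corrected prevalence: {corrected:.3f}")
```

With these inputs only about 16% of positive results (950 of 5,900) are true infections, which is why the apparent rate of 5.9% sits so far above the actual 1%.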

Vaccine efficacy and effectiveness

Vaccine efficacy rates and vaccine effectiveness are different, as discussed in the excellent backgrounder posted in Vox. The problem is that vaccine efficacy rates might not reflect their effectiveness. Vaccine efficacy rates are derived from comparing the newly infected cases in the vaccinated group vs. those in the placebo group. Say, for example, that 100 people became infected over a period of following thousands of vaccinated and placebo participants in a clinical trial. If each group had the same number of participants ‘treated’ at the outset, and if 75 of the infected individuals were in the placebo group and the remaining 25 in the vaccinated group, then non-vaccinated participants were 3X more likely to get Covid-19. This corresponds to a vaccine efficacy of about 67% (1 - 25/75). Vaccine efficacy is context (or trial) dependent because these statistics depend on when and where trials were conducted. Let’s say background case rates are high, possibly due to a highly infectious or highly transmissible variant of concern. In this scenario, even vaccinated individuals might become infected often, possibly before vaccines take effect or from multiple exposures. Such a scenario would depress estimates of vaccine efficacy rates with respect to infection likelihood. From an individual health standpoint, however, it is much more important to address the vaccine’s effectiveness: how the infected individuals fared in each group with respect to having mild to moderate symptoms vs. experiencing severe symptoms of the virus and requiring hospitalization. This comparison is less likely to be context dependent, as only severe cases are likely to be hospitalized across districts. So, comparing rates of hospitalization among infected individuals from vaccinated and placebo groups can be a rigorous way to assess the protection against disease severity afforded by vaccines. The different vaccines are currently all nearly completely effective in protecting against severe Covid-19 disease.
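The efficacy calculation in the trial example can be sketched in a few lines of Python. The text only says "thousands" of participants, so the arm size of 10,000 is a hypothetical value chosen for illustration; because both arms are the same size, the result does not depend on it. The function name is likewise an assumption.

```python
def vaccine_efficacy(cases_vaccinated, n_vaccinated, cases_placebo, n_placebo):
    """Efficacy as 1 minus the relative risk of infection (vaccinated vs. placebo)."""
    risk_vaccinated = cases_vaccinated / n_vaccinated
    risk_placebo = cases_placebo / n_placebo
    relative_risk = risk_vaccinated / risk_placebo
    return 1 - relative_risk


N_PER_ARM = 10_000  # hypothetical arm size; the example gives only "thousands"

# 25 infections among the vaccinated, 75 among placebo participants.
ve = vaccine_efficacy(25, N_PER_ARM, 75, N_PER_ARM)
print(f"vaccine efficacy: {ve:.1%}")
```

The same function could be applied to hospitalizations instead of infections to compare protection against severe disease, but the example above gives no counts for that comparison, so none are invented here.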

The (false) promise of ecologic (or aggregate) studies

Ecologic studies were rampant during the early days of Covid-19 and are still ongoing. Basically, these types of studies rely on comparisons of populations (often aggregated at the level of country) that vary in Covid-19 case or mortality rates and that also vary in key attributes. The thinking is that factors that explain statistical variation in Covid-19 case or mortality rates at the aggregate level might also be important at the individual level and might lead to beneficial interventions. Ecologic studies are, at best, hypothesis-generating exercises and, at worst, a complete waste of researchers’ limited time. But the results of ecologic studies can be appealing and do take up media bandwidth. There is a brief “Skeptic’s guide to ecologic studies during a pandemic”, posted roughly a year ago in Forbes magazine, that does an excellent job of outlining the limitations of ecologic studies in general and, more specifically, with respect to Covid-19. This post has heaps of wisdom in it and should be read by everyone interested in epidemiology and global health. One of the inherent problems with ecologic studies is that chasing leads can result in opportunity costs, as noted in the Skeptic’s guide. There might be promise in ecologic studies, but their utility will depend on a rigorous dose of epidemiological thinking at the outset, careful consideration of potential confounds and of the precision of outcome measures, and a healthy reluctance to just hack through minefields of big data.

Considering these three broad examples and their sources (testing for the virus, testing for vaccine efficacy and effectiveness, and ecologic testing of potential determinants of Covid-19 case and mortality rates) should be sufficient to introduce the limitations of statistics. Whether the statistical approaches used to address Covid-19 related phenomena are worthy of attention (or worthless) depends on knowing the context in which they were generated and ultimately cited. The above sources have all implied or stated explicitly that it is crucial to get the science and analysis right, especially during crises. To achieve this, we must be neither ignorant of, nor apathetic toward, the statistics, but embrace them for what they can tell us.
------
1. Waltner-Toews, D. 2020. On pandemics: deadly diseases from bubonic plague to coronavirus. Greystone Books Ltd., Vancouver. 262 pp.
