Sunday, April 5, 2020

Data, statistics, epidemics & public policy (2020)

Mortality risk, or more precisely the infection-fatality rate (IFR), is often difficult to estimate, especially at the beginning of a disease outbreak. In the case of the current covid epidemic, countries (and the media) report so-called case-fatality ratios (CFR), that is, deaths per diagnosed case. Reported case-fatality ratios suffer from several shortcomings. For a start, testing typically suffers from selection bias. Given limited testing capacity, often only people with mild-to-severe symptoms are tested. Even where large-scale testing is available, a symptom-based approach will miss infected people with very mild or no symptoms. This will bias the CFR upward relative to the actual risk of death from an infection. Counting fatalities is similarly fraught with difficulties. Some deaths attributable to covid will not be attributed to it because the deceased were not tested. Other deaths will be counted as covid-related because the person had covid even though the actual death was due to complications unrelated to covid. On balance, the CFRs published by countries (and US states) are likely to be higher than the actual infection-fatality rate, given that the undercounting of infections is more significant than the miscounting of deaths. This is what ‘natural experiments’ and the limited number of instances where large-scale covid testing was conducted suggest.

Case-fatality ratios exaggerate the actual risk of dying from a disease because actual cases are undercounted. Initial CFRs for corona were put at 3% based on Chinese data. Regardless of whether the Chinese data suffer from misreporting (as alleged by the CIA amongst others), the three-percent figure overstates actual mortality risk. Like virtually all other countries, China focused its testing efforts on people with flu-like symptoms, thus undercounting asymptomatic and mild cases. ‘Natural experiments’ also point towards this conclusion. The testing of all the passengers and crew on a cruise ship revealed that half of all cases were asymptomatic and that the actual IFR was around 1%. And even this probably overestimates the IFR in the wider population, given that an important high-risk group (people over 70 years of age) was overrepresented. Similarly, complete and repeated testing of the population of the Italian town of Vo showed that about half of all cases were asymptomatic. This has led several academic studies to infer that the IFR may be closer to 0.5%. IFR estimates that are consistent with the available data range from 0.05% to 1.0%. By comparison, the estimated IFR of seasonal influenza in the US is 0.1%.
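The gap between CFR and IFR can be illustrated with a few lines of arithmetic. The sketch below is only an illustration: the death and case counts are hypothetical numbers chosen to reproduce a 3% CFR, and `ascertainment` (the share of all infections that end up being diagnosed) is an assumed parameter, not something reported in the data discussed here.

```python
def case_fatality_ratio(deaths: int, diagnosed_cases: int) -> float:
    """Deaths per diagnosed case (the figure usually reported)."""
    return deaths / diagnosed_cases


def infection_fatality_rate(deaths: int, diagnosed_cases: int,
                            ascertainment: float) -> float:
    """Deaths per infection, given an assumed share of infections diagnosed."""
    if not 0 < ascertainment <= 1:
        raise ValueError("ascertainment must be in (0, 1]")
    # Scale diagnosed cases up to the implied total number of infections.
    total_infections = diagnosed_cases / ascertainment
    return deaths / total_infections


# Hypothetical counts chosen to give a 3% CFR; these are not real data.
deaths, diagnosed = 30, 1_000
print(f"CFR: {case_fatality_ratio(deaths, diagnosed):.1%}")
for ascertainment in (1.0, 0.5, 0.2, 0.1):
    ifr = infection_fatality_rate(deaths, diagnosed, ascertainment)
    print(f"share of infections diagnosed {ascertainment:.0%} -> implied IFR {ifr:.2%}")
```

Under these assumptions the implied IFR is simply the CFR multiplied by the share of infections that were diagnosed, which is why undercounting infections pushes the reported figure above the true risk. The sketch ignores the miscounting of deaths, which can work in either direction.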

Absent reliable data, decision-makers were forced to make major policy decisions on the basis of estimates of IFRs that remain uncertain and vary by a factor of up to 20. Even if sufficient testing capacity existed, random sampling would present challenges given the uneven pace of growth in infections across the US (and the world). Different parts of the US (and the world) are presumably at different stages of the epidemic, and the epidemic has likely taken different trajectories in different countries (and US states) due to varying local conditions (e.g. population density), policy measures (e.g. social distancing) and behavioral differences among local populations (e.g. Texas vs Taiwan). Moreover, infections often grow exponentially, making random sampling at a given point in time less reliable (and less useful). Sampling is also complicated by the fact that it will fail to count as infections people who have recovered by the time of sampling. (This is just one reason why antibody tests are so important.)

The longer the epidemic goes on, the more the uncertainty attaching to the data will diminish. The problem is of course that the longer one waits for the reliability of data to improve, the further one is likely to be “behind the curve” in terms of successful policy intervention, and this is especially problematic when dealing with a problem characterised by exponential growth. Policymakers (some more than others!) have been seeking to “flatten the curve” (that is, to slow the pace at which infections grow) in order not to overburden the healthcare system that can help save lives. Again, the problem is that at the very beginning of an outbreak, it is very difficult to estimate how many lives adequate healthcare can save. (At the beginning of an outbreak, when uncertainty is high, policymakers may be well-advised to err on the side of caution.)

Even if more reliable data were available, however, policymakers would need to rely on epidemiological models to assess possible outcomes. Epidemiological models generate scenarios based on assumptions. The better the empirical estimates of the model parameters and the better the model inputs, the better the projection. A significant degree of uncertainty about model parameters typically leads policymakers to rely on different models and/or various model projections based on different assumptions. Important policy decisions need to be taken on the basis of assumptions that are based on estimates that are in turn based on not very reliable data. As more data becomes available, the estimates improve. In the face of an exponentially growing problem, however, waiting too long is not necessarily an option. (As an aside, if policy action is taken, it will be difficult to determine ex post whether the baseline scenario would have occurred absent policy intervention. Counterfactuals are notoriously difficult to evaluate.)
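To make the point about assumptions concrete, here is a minimal sketch of a textbook SIR (susceptible-infected-recovered) model, stepped forward with simple Euler updates. It is not the model used by any particular government or research group. The function name `simulate_sir` and all parameter values (population size, a five-day infectious period, the range of R0 values) are hypothetical choices made purely for illustration.

```python
def simulate_sir(r0: float, infectious_days: float = 5.0,
                 population: float = 1_000_000.0,
                 initial_infected: float = 10.0,
                 days: int = 365, dt: float = 0.1) -> dict:
    """Run a basic SIR model and summarise the resulting scenario."""
    gamma = 1.0 / infectious_days      # recovery rate per day (assumed)
    beta = r0 * gamma                  # transmission rate implied by R0
    s = population - initial_infected  # susceptible
    i = initial_infected               # currently infected
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        peak = max(peak, i)
    return {
        "share_ever_infected": (population - s) / population,
        "peak_share_infected": peak / population,
    }


# Same model, different assumed R0: the scenarios diverge considerably.
for r0 in (1.5, 2.0, 2.5, 3.0):
    result = simulate_sir(r0)
    print(f"R0={r0}: ever infected {result['share_ever_infected']:.0%}, "
          f"peak infected at once {result['peak_share_infected']:.0%}")
```

The only thing that changes between runs is one assumed parameter, yet both the total share infected and the height of the peak (which is what matters for hospital capacity) shift markedly. Real models have many more parameters, each estimated from imperfect data.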

Ex post, the most sensible way to measure risk and cost is to determine ‘excess mortality’. This is the difference between the actual number of deaths during a disease outbreak and the number of deaths that would have been expected over the same period. While this approach circumvents to some extent the issue of classifying the correct cause of death, it will need to be adjusted to account for increases or decreases in deaths due to policy intervention and possible second-round effects on mortality. A shelter-in-place policy is bound to reduce deaths due to traffic accidents or gun violence but may increase deaths due to domestic violence or suicide.
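As a rough illustration of the excess-mortality idea, the sketch below takes the expected number of deaths in each week to be the average for that week over previous years and subtracts it from the observed count. The averaging baseline is an assumption made here for simplicity (statistical agencies typically use more elaborate baselines), the function names are hypothetical, and the weekly figures are made up.

```python
from statistics import mean


def expected_deaths(history: list[list[int]]) -> list[float]:
    """Average weekly deaths across prior years (one inner list per year)."""
    return [mean(week) for week in zip(*history)]


def excess_deaths(observed: list[int], expected: list[float]) -> list[float]:
    """Observed minus expected deaths, week by week."""
    return [o - e for o, e in zip(observed, expected)]


# Made-up weekly death counts for three prior years and the outbreak year.
history = [
    [1000, 1010, 990, 1005],
    [1020, 1000, 1010, 995],
    [980, 990, 1000, 1000],
]
observed = [1005, 1100, 1350, 1500]

baseline = expected_deaths(history)
excess = excess_deaths(observed, baseline)
print("weekly excess deaths:", [round(x) for x in excess])
print("total excess deaths:", round(sum(excess)))
```

Note that the result captures all deviations from the baseline, whatever their cause. It therefore also includes deaths avoided or added by the policy response itself, which is the adjustment problem described above.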

Regardless, the epidemic is set to kill significantly more people than seasonal influenza. Seasonal influenza has an estimated IFR of about 1/1000 and infects up to 55 million people in the US in a given year, translating into up to 55,000 deaths. Given that covid is novel and population immunity non-existent, it is reasonable to expect that absent suppression/mitigation measures, the total number of infections will reach the immunity threshold. The basic reproduction number (R0) absent policy measures is estimated at 2-3. Since the herd immunity threshold is 1 - 1/R0, a back-of-the-envelope calculation suggests that 50-66% of the US population might get infected, translating into 320m x 0.66 x 0.5% > 1 million deaths (US population x herd immunity threshold x IFR). This is at best a rough guess, but it illustrates how sensitive projections are to empirically estimated parameters like the basic reproduction number and the infection-fatality rate. Getting back to ‘excess mortality’, every year about 3 million Americans die. This means that if none of the 1 million projected corona-related deaths would have occurred absent corona, the US would suffer a 33% increase in deaths due to covid. Assuming that these projections are in the proverbial ballpark, the question from a public policy perspective is: how many lives can different policy measures save, and at what cost in terms of public health and human lives?
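The back-of-the-envelope calculation is easy to reproduce and to vary. The sketch below uses the standard herd immunity threshold formula, 1 - 1/R0, together with the population, R0 and IFR figures quoted in the text. The function names are hypothetical, and the output should be read as a sensitivity exercise under those stated assumptions, not as a forecast.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Share of the population that must be immune for infections to decline."""
    return 1.0 - 1.0 / r0


def projected_deaths(population: float, r0: float, ifr: float) -> float:
    """Population x herd immunity threshold x IFR."""
    return population * herd_immunity_threshold(r0) * ifr


US_POPULATION = 320e6      # figure used in the text
ANNUAL_US_DEATHS = 3e6     # figure used in the text

# Vary R0 and IFR over the ranges mentioned in the text.
for r0 in (2.0, 3.0):
    for ifr in (0.0005, 0.005, 0.01):
        deaths = projected_deaths(US_POPULATION, r0, ifr)
        print(f"R0={r0}, IFR={ifr:.2%}: about {deaths:,.0f} deaths "
              f"({deaths / ANNUAL_US_DEATHS:.0%} of normal annual deaths)")
```

Because the projection is a simple product, it scales one-for-one with the IFR: the factor-of-20 uncertainty in the IFR carries straight through to the projected death toll.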

The policy responses adopted to fight the epidemic have generally sought to “flatten the curve” by reducing the pace at which the number of infections grows. This is meant to (1) prevent overburdening the health care system, thus making it possible to save more lives, (2) limit the absolute number of infections, leading to fewer deaths, and, if (2) cannot be realised, (3) stretch out the infections over time in the hope of finding effective treatments or a vaccine before the herd immunity threshold is reached. Flattening the curve holds out the promise of preventing avoidable deaths. How many deaths can be prevented depends again on the estimates and likelihoods attaching to (1)-(3). Moreover, the expected number of lives saved needs to be set against the expected number of lives lost due to policy intervention. The social and economic disruption caused by mitigation and especially suppression policies may lead to increased deaths. Different policies (e.g. herd immunity approach, risk stratification, mitigation, suppression) will tend to lead to different trade-offs. It is obvious how much uncertainty attaches to the costs and benefits of the various policy options. And this is before the longer-term consequences of the various policies are taken into account.

Ironically, countries that do a good job at suppressing a large-scale outbreak will be at greater risk of a second and even a third wave of infections, unless effective treatments or a vaccine are discovered. While the authorities and the public will be more vigilant and the healthcare system better prepared, countries that fared well initially will be at greater risk later on, compared to countries where infections have reached the herd immunity threshold, whether through a conscious herd immunity policy or mitigation. This may translate into longer-term costs, including negative effects on psychological health due to continued uncertainty and related costs in terms of economic well-being due to restrictions on trade, travel etc.

Source: The Lancet

Policy choices hinge on estimates of fatality and mortality risk and the degree to which policy intervention can be expected to lower it. Mortality risk is strongly determined by the number of infections and the infection-fatality rate. These two variables can be influenced by public policies. But policymakers are forced to take impactful decisions on the basis of not very reliable estimates of key parameters. This may lead policymakers to opt for measures that are costly in economic, social and ultimately public health terms. If fatality is significantly lower than present estimates suggest (possible if not necessarily likely), then the mitigation measures would be more difficult to justify once the public health costs of the second-round effects of these decisions are factored into the equation. And maybe it will turn out that, based on the actual values of key variables, a risk stratification approach could have reduced the number of deaths to levels comparable to a lockdown policy, but at a lower (second-round) cost. While policy decisions should be scrutinized carefully once all the facts are in, it would be unfair to forget that, in the case of pandemics, crucial policy decisions have to be taken under conditions of heightened uncertainty and immense time pressure. Beware of armchair epidemiologists and policymakers who benefit from hindsight!