Estimating SARS-CoV-2 infections from deaths, confirmed cases, tests, and random surveys

Nicholas Irons, University of Washington
Adrian Raftery, University of Washington, Seattle

Multiple data sources provide information about the number of SARS-CoV-2 infections in the population, but all have major drawbacks, including biases and delayed reporting. Representative random prevalence surveys, the only putatively unbiased source, are sparse in time and space, and their results can arrive after long delays. Reliable estimates of population prevalence are necessary for understanding the spread of the virus and the effectiveness of mitigation strategies. We develop a simple Bayesian framework to estimate viral prevalence by combining several of the main available data sources. It is based on a discrete-time SIR model with a time-varying reproductive parameter. Our model includes likelihood components that incorporate data on COVID deaths, confirmed cases, and tests administered on each day. We anchor our inference with data from random sample testing surveys in Indiana and Ohio. We use the results from these two states to calibrate the model on positive test counts and then estimate the infection fatality rate and the number of new infections on each day in each state in the USA. We also estimate the extent to which reported COVID cases have underestimated true infection counts; the undercount was large, especially in the first months of the pandemic.
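
As a rough illustration of the kind of discrete-time SIR recursion with a time-varying reproductive parameter mentioned above, the following is a minimal sketch, not the authors' model. The function name `simulate_sir`, the parameterization of the transmission rate as the reproductive number times a constant daily recovery rate, and all numeric inputs are assumptions made for illustration only; the Bayesian likelihood components for deaths, cases, and tests are not shown.

```python
def simulate_sir(population, initial_infected, r_t, recovery_rate):
    """Deterministic discrete-time SIR sketch with one step per day.

    Illustrative assumptions (not taken from the abstract):
      - r_t is a sequence of daily reproductive numbers
      - the daily transmission rate is beta_t = r_t * recovery_rate
      - recovery_rate is a constant fraction of infectious people
        who leave the infectious compartment each day
    """
    susceptible = population - initial_infected
    infectious = initial_infected
    removed = 0.0
    new_infections = []

    for r in r_t:
        beta = r * recovery_rate
        # New infections cannot exceed the remaining susceptible pool.
        infections_today = min(susceptible, beta * susceptible * infectious / population)
        removals_today = recovery_rate * infectious

        susceptible -= infections_today
        infectious += infections_today - removals_today
        removed += removals_today
        new_infections.append(infections_today)

    return new_infections, (susceptible, infectious, removed)


if __name__ == "__main__":
    # Hypothetical inputs: 30 days of growth followed by 30 days of mitigation.
    r_path = [2.0] * 30 + [0.8] * 30
    daily, final_state = simulate_sir(1_000_000, 100, r_path, recovery_rate=0.2)
    print(f"total infections over {len(daily)} days: {sum(daily):.0f}")
```

In a framework like the one described, the daily infection series would be a latent quantity inferred from the data rather than simulated forward from fixed inputs as it is here.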

Keywords: COVID-19, Bayesian methods / estimation, Population projections, forecasts, and estimations, Health and morbidity

See extended abstract.

Presented in Session 43. Estimating Impacts of COVID-19 on Mortality, Morbidity, and Society