Since the beginning of the COVID-19 pandemic, daily counts of confirmed cases and deaths have been publicly reported in real time to help control the spread of the virus. However, a substantial number of undocumented infections has obscured the true prevalence of the virus. A machine learning framework was developed to estimate the time courses of actual new COVID-19 cases and current infections in 50 countries and 50 U.S. states from reported test results and deaths, as well as published epidemiological parameters. Severe under-reporting of cases was found to be universal. Our framework projects that in countries such as Belgium, Brazil, and the U.S., ~10% of the population has been infected at some point. In U.S. states such as Louisiana, Georgia, and Florida, more than 4% of the population is estimated to be currently infected as of September 3, 2020, while in New York the fraction is 0.12%. Estimating the actual fraction of currently infected people is crucial for defining public health policies, which up to this point may have been misguided by reliance on confirmed cases.