Project

Description
The Centers for Disease Control and Prevention (CDC) is hosting collaborative forecasting projects that predict the spread of coronavirus disease (COVID-19) and anticipate the mortality, hospitalizations, and influenza-like illness (ILI) it causes across the country. This time-critical project comprises a handful of teams of leading data scientists, epidemiologists, statisticians, and high-performance computing researchers from national laboratories, public universities, public health institutions, and some private-sector organizations. We are leading two teams, one predicting ILI burden and another predicting COVID mortality and hospitalizations. Our teams use deep learning models to forecast specific targets at the national, regional, state, and local levels. In addition to CDC ILI and COVID data, we incorporate many other real-time datasets, such as syndromic surveillance data and point-of-care data from major providers. We combine these datasets with domain knowledge in end-to-end deep learning models to predict targets on a weekly basis. The CDC synthesizes our weekly and monthly predictions with those of other models to inform policy and planning decisions that help communities prepare for and fight the disease. We have extensive experience with disease surveillance: we have led a team in the CDC FluSight challenge since 2018 (our model EpiDeep had the best performance in the HHS1 region) and have published multiple research papers on incidence prediction at major venues (see related publications below).

Results
We show two important changes in trend that our model DeepCOVID predicted several weeks in advance: the second peak in US national incidence mortality (left) and the uptrend in CA incidence mortality (right). Both of the pictures above use the JHU dataset as ground truth. Also see our latest mortality prediction results on the CDC website and on FiveThirtyEight (one of 11 selected models).