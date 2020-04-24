The COVIDcast website, put together by machine learning researchers in Carnegie Mellon’s Delphi group, combines several self-reported data sources with the goal of a “nowcast,” a real-time picture of the disease that will be used to create new forecasts.

Correction: Quidel provides tests for influenza, not COVID-19, as an earlier version of this article stated. The Delphi group expects to begin nowcasting in a few weeks.

Doctors, researchers, and governments never have a full picture of an outbreak at any given time. In fact, its full extent may be unknowable, as scientists have pointed out.

And the extent of society’s lack of knowledge comes to light in the scrutiny of COVID-19 models.

Daily presentations by New York Governor Andrew Cuomo reveal the uncertainty in the multiple models of the disease. Popular media outlets have pointed out where the models fall short, including “hidden” outbreaks spreading in U.S. cities and case rates much higher than the official ones.

The problem is that the way COVID-19 has been estimated, and the predictions that have been made, rely on a fairly blunt tool, based on an approach to epidemiological forecasting created 100 years ago.

That is about to change.

Scientists at Carnegie Mellon University, one of two “centers of excellence” for influenza research (the other is UMass Amherst), have presented a combination of five maps of symptoms reported all over the country by people who feel something that could be COVID-19, although it could be something else.

You can see the five maps on COVIDcast, the CMU website created by Delphi, the laboratory running the experiment, which is led by professors Roni Rosenfeld and Ryan Tibshirani, both of CMU’s Machine Learning Department. Tibshirani also has an appointment in CMU’s Statistics Department.

This combined symptom map, COVIDcast, is an example of digital surveillance, which uses real-time information to track the onset of the disease and see its subtleties and nuances on a local scale. Sources include surveys taken by Facebook users and by users of Google Opinion Rewards, along with Quidel Corp. (a medical test maker), which records when people order a flu test, something that may indicate a person is experiencing symptoms similar to those of COVID-19.

Collecting this data in real time will eventually lead to what is called “nowcasting,” a practice that has evolved over the last decade as a way around the slow pace at which epidemiological data accumulates. The Delphi team has spent eight years refining nowcasting to predict seasonal flu on behalf of the Centers for Disease Control and Prevention.

Their approach to the flu was based on cases reported by health care providers. This data, known as ILINet, lags by a week, which means that patients seen by doctors are not reported until Friday of the following week.

The Delphi team developed nowcasting as a way to fold in real-time data, such as Google search data showing how many people are looking up flu symptoms. With a statistical approach, they augment what the medical reports say by folding in what the real-time sources say.
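To make the idea concrete, here is a minimal sketch of one way a lagged medical signal could be supplemented with a real-time one. It is not the Delphi group’s actual method: it assumes a plain least-squares line fitted between past search volume and past reported flu levels, and the function names (`fit_line`, `nowcast`) and all of the numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nowcast(past_search, past_reported, current_search):
    """Estimate this week's flu level before the medical reports arrive.

    past_search / past_reported: weeks for which both signals are known.
    current_search: this week's real-time search signal.
    """
    slope, intercept = fit_line(past_search, past_reported)
    return slope * current_search + intercept


# Made-up illustrative data, not real ILINet or Google figures.
past_search = [10.0, 20.0, 30.0, 40.0]   # search-volume index per week
past_reported = [1.5, 2.5, 3.5, 4.5]     # % of doctor visits for flu-like illness
estimate = nowcast(past_search, past_reported, current_search=50.0)
print(f"Nowcast for the current week: {estimate:.2f}%")
```

A real system would weigh several such signals against one another and account for their uncertainty; the sketch only shows the basic step of translating a fast signal into the units of a slow one.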

The Delphi group is now building on this flu experience to create a new predictive approach for COVID-19. Using digital surveillance data from volunteered reports, and the sensor fusion approach to combining multiple data sources, they will in the coming weeks begin providing predictions of the disease based on the current picture of things. Once the forecasts begin, Delphi intends to add nowcasting for COVID-19.

The COVID-19 effort is not simply an extension of the flu forecast. New statistical approaches are required, because COVID-19 is not the same as seasonal flu.

“What we observe now is not similar to what we have observed with the historical flu,” Tibshirani remarked during a presentation at the COVID-19 and AI conference on April 1, sponsored by Stanford University’s Institute for Human-Centered Artificial Intelligence. “By definition of a pandemic, [COVID-19] is nothing like what we’ve observed, period.”

To analyze the mass of surveillance data, the Delphi team has over many years refined a series of statistical and machine learning approaches. They include something called delta density, which incorporates approaches based on “Markov processes,” in which a given state of affairs can be inferred from previous states. Later data can be used retrospectively to revise earlier assumptions about the state of affairs, thus continually refining a model of the disease.
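The general idea of inferring a state from earlier states, and then revising past estimates once later data arrives, can be sketched with a tiny hidden Markov model. This is not delta density and not the Delphi group’s code; the two states, the probabilities, and the observation labels below are all hypothetical, chosen only to show filtering (using data so far) versus smoothing (using later data as well).

```python
STATES = ("low", "high")                      # hypothetical disease-activity levels
START = {"low": 0.5, "high": 0.5}
# Markov assumption: this week's state depends only on last week's state.
TRANS = {"low": {"low": 0.9, "high": 0.1},
         "high": {"low": 0.1, "high": 0.9}}
# Chance of each kind of weekly survey reading in each state (made up).
EMIT = {"low": {"quiet": 0.8, "elevated": 0.2},
        "high": {"quiet": 0.3, "elevated": 0.7}}


def normalize(dist):
    total = sum(dist.values())
    return {state: p / total for state, p in dist.items()}


def forward(observations):
    """Filtered beliefs: P(state at week t | readings up to week t)."""
    beliefs = []
    prior = START
    for obs in observations:
        belief = normalize({s: prior[s] * EMIT[s][obs] for s in STATES})
        beliefs.append(belief)
        # Predict next week's prior from this week's belief.
        prior = {s: sum(belief[r] * TRANS[r][s] for r in STATES) for s in STATES}
    return beliefs


def smooth(observations):
    """Smoothed beliefs: P(state at week t | all readings, including later ones)."""
    filtered = forward(observations)
    backward = {s: 1.0 for s in STATES}
    smoothed = [None] * len(observations)
    for t in range(len(observations) - 1, -1, -1):
        smoothed[t] = normalize({s: filtered[t][s] * backward[s] for s in STATES})
        obs = observations[t]
        # Carry the evidence from week t back to week t - 1.
        backward = normalize({
            r: sum(TRANS[r][s] * EMIT[s][obs] * backward[s] for s in STATES)
            for r in STATES
        })
    return smoothed


readings = ["quiet", "elevated", "elevated", "elevated"]
for week, (f, s) in enumerate(zip(forward(readings), smooth(readings))):
    print(f"week {week}: filtered P(high)={f['high']:.2f}, smoothed P(high)={s['high']:.2f}")
```

In this toy run, the belief that week 1 was already a high-activity week rises once the later elevated readings are taken into account, which is the kind of retrospective revision described above.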

Combining the different signals takes a certain judgment about how to handle and weigh the data, as Tibshirani explained to ZDNet in an email.

“These signals do not measure the same thing,” Tibshirani wrote, “they are not even drawn with respect to the same population (Facebook and Google users).”

The Delphi group does not view the survey signals as “ground truth,” a statistical phrase that means objectively true. Rather, “it is the individual temporal trends that interest us most,” Tibshirani wrote.

“For example, if in a given county we see both signals increase, this may be a significant indicator of increased COVID-19 activity. This implicitly assumes that their individual biases are constant over time, which is a reasonable assumption.”
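The reasoning in the quote can be illustrated with a small sketch. It assumes, purely for illustration, that each signal equals the true activity plus a constant additive bias; under that assumption the week-over-week changes are the same for both signals even though their levels differ. The function names and numbers are hypothetical.

```python
def weekly_changes(series):
    """Week-over-week differences; a constant additive bias cancels out."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def both_rising(signal_a, signal_b):
    """For each week after the first, True if both signals increased."""
    changes_a = weekly_changes(signal_a)
    changes_b = weekly_changes(signal_b)
    return [da > 0 and db > 0 for da, db in zip(changes_a, changes_b)]


# Made-up numbers: the same underlying activity seen through two biased surveys.
true_activity = [1.0, 1.0, 2.0, 4.0]
survey_a = [x + 3.0 for x in true_activity]   # hypothetical constant bias of +3.0
survey_b = [x - 0.5 for x in true_activity]   # hypothetical constant bias of -0.5

print(weekly_changes(survey_a))        # matches the changes in true_activity
print(weekly_changes(survey_b))        # matches them as well
print(both_rising(survey_a, survey_b))
```

If the biases drifted over time, the differences would no longer cancel, which is why the constant-bias assumption matters to the argument.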

To understand the importance of the nowcasting effort and the predictions that will result from it, consider the limitations of current models, including those from Imperial College London and Columbia University.

All of these models are what are known as “mechanistic” models, because they are based on a very general understanding of the mechanism by which all infectious diseases spread, the “transmission dynamics,” as it is sometimes called.

The mechanistic models in use today are mostly derived from one mathematical approach, called the compartmental model. The most familiar form of the compartmental model is the so-called susceptible, infectious, recovered model, or SIR. The SIR model was first introduced in 1927 by scientists William Ogilvy Kermack and Anderson Gray McKendrick.

The SIR model, and the many variants used today, such as SEIR, which includes “exposed” people, are sets of equations into which one plugs values for variables such as the number of people currently infected, in order to come up with a curve for the progression of a disease. These models have come to dominate people’s thinking about the spread of the disease. The Reich Lab at UMass, the other center of excellence, has brilliantly brought models together to show people the difference in their predictions.
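For readers who want to see what such a set of equations looks like in practice, here is a minimal sketch of the textbook SIR model, stepped forward with simple Euler integration. In its standard form, people move from susceptible to infectious at a rate of beta × S × I / N and from infectious to recovered at a rate of gamma × I. The parameter values below are illustrative assumptions, not estimates for COVID-19, and the function name is hypothetical.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Return one (S, I, R) tuple per day for a basic SIR model.

    beta:  transmission rate per day (assumed value, for illustration)
    gamma: recovery rate per day (1 / average infectious period)
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    steps_per_day = round(1 / dt)
    for _day in range(days):
        for _step in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Illustrative parameters only: beta / gamma = 3 new infections per case early on.
history = simulate_sir(population=1000, initial_infected=1,
                       beta=0.3, gamma=0.1, days=160)
peak_day, peak = max(enumerate(history), key=lambda item: item[1][1])
print(f"Infections peak around day {peak_day} with about {peak[1]:.0f} people infectious")
```

The single pair of rates applied to the whole population is what makes the model rigid: everyone is assumed to mix and transmit in the same way.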

These SIR-derived models have proven their worth for decades, but they have limitations. The most glaring is their reliance on data from official sources, such as the number of confirmed cases. The other big limitation is that the models are quite rigid, all making similar assumptions about the mechanism by which the disease spreads.

The Delphi COVIDcast group’s statistical approach is likely to bring out the nuances of COVID-19. We already know that the disease affects people in very different ways depending on age and pre-existing chronic conditions such as obesity. We also know that there is genetic variation in the disease, with different strains mixing in the infected population and different proteins giving the disease an antigenic signature that can vary over time.

As more and more data is incorporated as signals, more of the specifics of how COVID-19 spreads will come to light. Rather than a single disease that spreads uniformly through the population, doctors and governments may have to consider a complex of conditions that need to be addressed in different ways.

Epidemiology will learn about disease surveillance from this pandemic, as a result of digital surveillance and nowcasting on the largest scale ever attempted. The practice of monitoring and fighting diseases may never be the same again.