When we look back at the COVID pandemic, what will hindsight tell us? Will we remember the turn of the decade as the moment that finally brought real change to pandemic preparedness, or will our eventual return to “normal” stymie our progress?

Although epidemiologists have long warned about the potential for global pandemics, their admonitions have largely gone unheeded. Yet industrialized animal farming practices, increased human-animal contact, globalization, decreasing biodiversity and other factors all point to the likelihood of another zoonotic disease (one transmitted from animals to humans) with pandemic potential.

A slim silver lining of the current COVID-19 pandemic is that it can help us better prepare for future outbreaks—if we harness what we’ve learned correctly. In particular, we can better leverage one of the most crucial resources we have when it comes to pandemic preparedness: real-world data.


The pandemic has created a trove of data that can help us plan for future disease outbreaks. The abundance of research on the U.S. pandemic response provides insight into the benefits and consequences of various courses of action, and we can leverage this knowledge for future response.

One of the main takeaways is that the health care system needs real-time visibility into how a disease is spreading. While observers have stated again and again that the ineffective roll-out of testing was (and still is) one of the U.S.’s biggest failings in getting ahead of COVID-19, there is a wealth of other data that can offer insight into the virus’s spread. We need to get better at collecting, sharing and analyzing this real-world data so we can rapidly recognize COVID-19 symptoms, identify effective treatments and more quickly track the spread.

For example, when the pandemic began, information disseminated by public health organizations identified sore throat, shortness of breath, cough and fever as symptoms. Months later, however, additional symptoms like rashes and skin discoloration, such as on the toes and feet, were recognized as potential indicators of the virus. Additionally, what has been termed “silent hypoxia,” in which COVID-19 causes critically low blood-oxygen levels without any noticeable external effects on breathing, killed many patients before doctors knew to be on the lookout for it.

Why weren’t we able to recognize these symptoms sooner? The electronic health records (EHRs) in which physicians document patient visits do not offer an easy, effective way to share data at scale. If de-identified patient data could have been mined at a national level, artificial intelligence and machine-learning algorithms could have identified patterns far faster than isolated researchers working with small patient pools did. Instead, COVID-19 data was not examined holistically: within six months, researchers had published over 23,500 papers, a wealth of information, but far too much for any one individual to parse in search of the most valuable studies.

Centralizing data access could have not only sped the identification of COVID symptoms but also allowed for rapid studies of effective treatments. Researchers could use a truly robust database to analyze and identify which treatments are most effective for patients with various underlying conditions or disease histories.

Furthermore, using machine-learning techniques within a shared database could generate predictive insights, showing the patterns in communities that precede outbreaks and helping dictate where and when lockdowns and social distancing orders should be implemented. Several countries are already using unconventional data sources, like de-identified cell phone and fitness tracking data, to predict COVID outbreaks. For example, Germany is using de-identified tracking apps to identify anomalies in day-to-day habits, such as regularly active users skipping exercise or walks, to predict when a community is about to experience an outbreak and to contain it before it worsens.
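
To make the idea of spotting anomalies in day-to-day habits concrete, here is a minimal sketch in Python. It is not a description of how any country's system actually works. The function names, the cutoffs and the step counts are all hypothetical, and a real system would need far more careful statistics and privacy safeguards. The sketch compares each de-identified user's activity today with that user's own recent history, then flags a community when an unusually large share of its users has an unusually quiet day.

```python
from statistics import mean, stdev

def below_baseline(history, today, z_cutoff=2.0):
    """Return True if today's activity is unusually low for this user.

    history: past daily step counts for one de-identified user (hypothetical data).
    z_cutoff: illustrative threshold, in standard deviations below the user's mean.
    """
    if len(history) < 2:
        return False  # not enough history to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today < mu
    return (today - mu) / sigma < -z_cutoff

def community_alert(users, share_cutoff=0.3):
    """users: list of (history, today) pairs.

    Returns (alert, share), where share is the fraction of users whose
    activity today is unusually low. share_cutoff is an assumed value.
    """
    if not users:
        return False, 0.0
    low = sum(below_baseline(history, today) for history, today in users)
    share = low / len(users)
    return share > share_cutoff, share

# Toy data: ten users with the same steady history of daily steps.
steady = [8000, 8200, 7900, 8100, 8050, 7950, 8000]
typical_day = [(steady, 8000)] * 9 + [(steady, 3000)]      # one user has a quiet day
unusual_day = [(steady, 3000)] * 5 + [(steady, 8000)] * 5  # half the users do

print(community_alert(typical_day))
print(community_alert(unusual_day))
```

Measuring each person against his or her own baseline, rather than against a population average, means a naturally sedentary user does not trigger a false alarm. Only a simultaneous drop across many users raises a flag.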

The experience of Israel offers a great example of how real-world data can be analyzed and shared. By swiftly rolling out the Pfizer vaccine to more than half its population and tracking the results, the country was able to demonstrate a dramatic decrease in serious infections and hospitalizations as a consequence of the vaccine. This real-world evidence is key to understanding how the vaccine works outside the confines of controlled clinical trials and in much larger populations.

These measures represent just the basics of what policy makers can do to enable real-time insights. And the benefits need not be limited to pandemic preparedness; mining and analyzing de-identified data could help identify effective strategies for fighting any number of conditions, from mental health concerns to chronic illnesses.


When the next novel virus with pandemic potential inevitably arises, the changes and preparations we put in place in the coming months and years will determine whether we can better manage another crisis on the scale of COVID-19. We must act urgently, as our health care system continues to experience breakdowns of data sharing at every level. While COVID-19 testing has increased dramatically, organizations still struggle to share test results, and some facilities rely on fax machines to communicate time-sensitive information. When fighting an ongoing pandemic, results delivered weeks after testing serve little purpose in preventing the spread of disease. We must enable real-time insight and recognize the importance of studying past events if we are to have the foresight to prevent the next pandemic.

While some countries, like the U.K., have dedicated significant resources to sequencing additional COVID-19 genomes, the U.S. ranks 32nd in the world in the number of sequences completed per 1,000 COVID cases. The inability to identify the mutated virus, and to easily recognize any significant shifts in virus epidemiology at that level, will continue to hamper our ability to predict and prevent spread.

While there will always be differing opinions on the best course of action for pandemic preparedness and prevention, we need to create a more effective forum for discussion and must continue to encourage discourse among many disciplines to weigh the potential social, economic and physiological ramifications of various courses of action. These discussions should not wait until the next pandemic arrives. Instead, we must establish and fully fund think tanks and committees to imagine possible scenarios and responses.

We should seek to answer important questions, such as: How long can businesses of various socioeconomic levels survive closures, and what type of aid is most effective? What are the long-term implications of a child missing a year of school or attending school virtually? How does isolation affect mental health among people of different age groups and income levels, in both urban and rural settings, and what strategies work to mitigate these effects? What lessons can be learned from countries with sophisticated data capture systems?

With the right data to analyze—and the right experts to analyze such data—we are fully capable of answering these questions and gathering the necessary insights to understand the ongoing impact of COVID-19. Armed with this knowledge and with a global recognition of the consequences of an ineffective response, we’ll have the motivation and means to take the appropriate precautions and prevent a future pandemic before it starts.

This is an opinion and analysis article.