Over the past century, many notable viruses have emerged from animals to cause widespread illness and death in people. The list includes the pathogens behind pandemic influenza, Ebola, Zika, West Nile fever, SARS and now COVID, brought on by the virus SARS-CoV-2. For all of these microbes, the animal species that served as the original source of spillover was hard to find. And for many, that source still has not been conclusively identified. Confirming the circumstances and key participants involved in the early emergence of an infectious disease is a holy grail of this type of scientific inquiry: difficult to track and even more difficult to prove.

In ideal conditions, the first human cases involved in a zoonotic disease spillover (when a pathogen jumps from animals to humans) are reported in connection with animals present at the time of the event. This happens when the cluster of cases is large enough to be investigated and reported. But it is not necessarily the first time spillover occurred. Most spillovers are limited to isolated animal-to-human cases. Once pathogens start to spread by human-to-human transmission, the tracks leading back to the initial animal source grow faint and become nearly impossible to follow.

Thus, animal sources for viruses that cause pandemics often remain shrouded in mystery. For some viruses, animal sources have been implicated after years or decades of large-scale international investigations. For other viruses, animal sources are highly suspected, but enough evidence has yet to be produced to pinpoint an exact species or range of species. Typically, lines of evidence are drawn over time through a trove of peer-reviewed publications, each building on the research that came before it, using more precise methods to narrow the field of possible sources. The scientific process is naturally self-correcting. Often seemingly contradictory hypotheses can initially flood the field, especially for high-impact outbreaks. But eventually, some of them are ruled out, and lines of investigation are narrowed.

Frequently, this investigative research only points to a group of suspected species, possibly a few most likely genera or, more often, an entire taxonomic order. That is because the virus has not actually been found in the suspected animal source in such cases. The evidence instead revolves around closely related viruses or their most recent common ancestors, based on inferred evolutionary history. If a virus was found in animal samples after the same pathogen caused widespread transmission among humans, it is possible that the virus spilled from humans back into animals. That happens often enough with viruses that can infect a range of animal species that the possibility needs to be presumed until it is ruled out.

The best way to rule out such spillback is to examine archives of specimens that were collected and stored prior to the initial outbreak. For these retrospective studies to work well, the specimens need to be the ideal type of samples, and they must come from the correct species and be stored in a way that allows scientists to recover the virus of interest.

Most viruses of interest typically infect animal hosts for only a matter of days. Detection of viruses that cause pandemics thus requires sample sizes that are orders of magnitude larger than what is needed to detect endemic diseases or viruses that are long-lived in their host. One could get lucky, but rigor in scientific inquiry demands large sample sizes to power these types of analyses.
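
To give a rough sense of the scale involved, the short sketch below computes how many animals would need to be sampled to have a given chance of catching at least one active infection. It is a simplified illustration rather than a description of how any particular study was designed: it assumes every animal is sampled independently, that the test never misses an infection, and that the share of animals infected at the moment of sampling (the prevalence) is known. The function name and the prevalence values are hypothetical, chosen only for illustration.

```python
import math


def samples_needed(prevalence: float, confidence: float = 0.95) -> int:
    """Smallest number of animals to sample so that the chance of catching
    at least one infected animal is at least `confidence`.

    Simplifying assumptions (not from the article): independent samples,
    a test that never misses an infection, and a known prevalence.
    """
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no infected animal among n samples) = (1 - prevalence) ** n.
    # Solve (1 - prevalence) ** n <= 1 - confidence for n and round up.
    return math.ceil(math.log(1 - confidence) / math.log(1 - prevalence))


# Hypothetical prevalence values, for illustration only.
for prevalence in (0.1, 0.01, 0.001):
    print(f"prevalence {prevalence:.3f}: sample {samples_needed(prevalence)} animals")
```

Under these assumptions, a virus present in 10 percent of animals at the time of sampling calls for 29 samples, whereas one present in 0.1 percent of animals calls for 2,995. A short-lived infection pushes the prevalence at any given moment toward the low end of that range.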

Investigations into an animal source that immediately follow a viral emergence event have an additional challenge. Because an outbreak in animals likely would have preceded the outbreak in humans, infections in animals would have already peaked. Few or none of them would still be infected. Immediately postoutbreak, the probability of identifying infection in live animals could be especially low, thus requiring even larger sample sizes. In China, it is not surprising that scientists did not find SARS-CoV-2 in potential animal sources immediately after the human outbreak in Wuhan. Nor does that result indicate there is a problem with the wildlife spillover theory. This is a difficult search that takes time.

Immunologic evidence of previous infection can be detected in a possible animal host in the form of antibodies, but new serological assays must be developed for a new virus. At best, this type of evidence is nondefinitive—and at worst, it leads us in the wrong direction in the hunt. Antibody responses to viruses are notoriously cross-reactive: the serological assays will react in the same way to related viruses, both known and as yet unrecognized. These assays must be evaluated and validated in every species, and there is no gold standard test for a new virus in a new animal. Any efforts to apply new tests to animals would need to be verified with repeated testing and supporting data.
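
A bit of arithmetic shows why cross-reactive antibody tests can mislead. The sketch below applies Bayes' rule to estimate what fraction of positive results reflect a true past infection with the virus of interest. All the numbers are hypothetical and are not drawn from any real assay: the sensitivity, the specificity (which cross-reactivity with related viruses would lower) and the share of animals that were truly infected are assumptions made only for illustration.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              true_exposure: float) -> float:
    """Fraction of positive test results that are true positives (Bayes' rule).

    sensitivity:   P(test positive | animal was infected)
    specificity:   P(test negative | animal was not infected); cross-reactivity
                   with related viruses lowers this value
    true_exposure: share of animals that were actually infected
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("true_exposure", true_exposure)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positives = sensitivity * true_exposure
    false_positives = (1 - specificity) * (1 - true_exposure)
    if true_positives + false_positives == 0:
        raise ValueError("the test never returns a positive under these inputs")
    return true_positives / (true_positives + false_positives)


# Hypothetical assay: 95% sensitive, 95% specific, 1% of animals truly exposed.
ppv = positive_predictive_value(0.95, 0.95, 0.01)
print(f"Share of positives that are real: {ppv:.1%}")
```

With these made-up values, only about 16 percent of positive results would reflect a real past infection. The rest would be false alarms, which is why positive results in a new species need confirmation by repeated testing and supporting data.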

As the scope of investigations broadens, other challenges must be met. Which species should be prioritized? Which locations should be investigated? Heading down the wrong path leads nowhere and wastes valuable time. Viral infections in animal populations are notoriously unpredictable, governed by dynamics that can only be uncovered with in-depth longitudinal studies after a virus has been found.

That brings us to the speed at which science works. Transdisciplinary collaborative research to investigate a novel virus takes extra time: detection techniques must be tailored to the new pathogen and customized to answer an array of research questions. Scientists are cautious about overinterpreting data and making unwarranted assumptions. And in the midst of a pandemic, understanding origins might not be the most pressing issue. During COVID, many scientists have pivoted to research that might help save lives this year—by modeling the trajectory of spread, characterizing SARS-CoV-2 variants and investigating the chances that the virus could spill back into different animals that serve as a new viral reservoir, ultimately threatening people again.

Timely exploration of the source of SARS-CoV-2 is important, but future pandemic preparedness requires a deep understanding of the mechanisms involved in the emergence of a much wider array of viruses with pandemic potential. With such knowledge, we will have better than a few vague and scattered clues the next time a novel disease emerges.