Viruses do not make good fossils. But advances in genomic technology have allowed scientists to peer into the genetic material of viruses and their hosts to search for clues about their shared evolutionary history.

Genetic code from retroviruses has been found to compose some 8 percent of the human genome, having been copied in during replication and left to be inherited by us and our progeny. But non-retroviral RNA viruses do not use their host's DNA to replicate—and some do not even enter the host cell's nucleus. Nevertheless, new research has turned up surprising evidence that some of these viruses are enmeshed in the genomes of vertebrates—including humans and other mammals.

One of these new studies, published online July 29 in PLoS Pathogens, has uncovered some 80 examples of viral genetic data circulating in the genomes of vertebrate species for the past 40 million years.

To discover these connections, the group ran computer analyses of 5,666 genes from all known non-retroviral, single-stranded RNA virus families against the genomes of 48 vertebrate species. The strongest matches belonged to just two virus groups: bornaviruses and filoviruses, the latter of which includes the deadly Ebola and Marburg hemorrhagic fever pathogens.
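The article does not name the software the group used, so the following is only a toy sketch of the general idea of screening a set of viral genes against host genomes. It assumes exact shared substrings (k-mers) as a crude stand-in for real sequence alignment, which tolerates mutations and is often done at the protein level. All sequence names, sequences and thresholds here are invented for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def screen(viral_genes, host_genomes, k=8, min_shared=3):
    """List (host, gene, shared_kmer_count) for pairs sharing at least
    min_shared k-mers, strongest match first."""
    hits = []
    for host, genome in host_genomes.items():
        genome_kmers = kmers(genome, k)
        for gene, seq in viral_genes.items():
            shared = len(kmers(seq, k) & genome_kmers)
            if shared >= min_shared:
                hits.append((host, gene, shared))
    return sorted(hits, key=lambda hit: -hit[2])


# Hypothetical toy data: gene A is embedded whole in host 1; gene B is not.
viral_genes = {
    "toy_gene_A": "ATGGCGTACCTGAAGTTC",
    "toy_gene_B": "TTTTCCCCGGGGAAAATTTT",
}
host_genomes = {
    "toy_host_1": "CCCC" + "ATGGCGTACCTGAAGTTC" + "GGGA",
    "toy_host_2": "ACACACACACACACACACACACAC",
}

hits = screen(viral_genes, host_genomes)
for host, gene, shared in hits:
    print(f"{gene} matches {host} ({shared} shared {8}-mers)")
```

Exact k-mer matching would miss most real viral fossils, because an integrated sequence that is millions of years old has accumulated many mutations. That is why a practical search needs alignment methods that score approximate similarity.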

Another recent paper, published January 2010 in Nature, found bornavirus genes in the human genome. (Scientific American is part of Nature Publishing Group.)

Previous research had located evidence of viral fragments in the genomes of plants and insects, but in the past year new findings of these code segments in vertebrates surprised many biologists. "Retroviruses are an enormous fraction of the human genome, but that was a little understandable because the viruses have to inject their material into the DNA to survive," says Anna Marie Skalka, basic science director emeritus at Fox Chase Cancer Center in Philadelphia and co-author of the PLoS Pathogens paper. But errant genetic material from viruses that are not retroviruses can also find its way into the genome of germ line cells during the RNA copy process. That material can then get spliced into the genome by long interspersed nuclear elements (LINEs), which are usually busy copying their own RNA.

When these infrequent flubs happen, they can be beneficial, harmful or neutral, Skalka explains. "There are LINE integrations that cause cancer or you could look at them as providing fodder for evolution—we have more sequences in there that can evolve and eventually make other genes."

Ancient code
Many viruses can undergo incredibly rapid adaptation, eluding immune systems that learn to recognize previous strains. But some researchers are pointing to emerging data on these viral "fossils" as indication that many viruses are in fact ancient—and have changed little since their material was integrated into host genomes.

"Previously there was no way to get an idea of how old they were," Skalka says of these viruses. But like tracing the evolutionary history of other organisms, genomic analysis can give scientists a new way to assess these estimates.

The new findings support a theory of Tufts University School of Medicine molecular biologist John Coffin that viruses in the bornavirus and filovirus groups "are really very old—despite the fact that card-carrying evolutionary biologists have concluded that they must be evolving very rapidly and probably [are] very recent," he says. "That is clearly very wrong." He cites this and other recent work to conclude that some of these hemorrhagic fever viruses have changed little in the course of primate evolution.

This time differential is both exciting to researchers who have been hunting for ancient evidence of the viruses' evolution and challenging to those who are trying to find these viral fossils in contemporary genetic codes.

"It is a big challenge" to match genes from modern viruses with those that might have entered animal genomes millions of years ago," Skalka says. "These RNA viruses evolve very, very rapidly and they change very, very rapidly—so the probability that you could find something that existed 40 million years ago could be very low."

Even though the recent assays have found several firm viral matches, many researchers assume there is likely much more virus material hiding out in our genomes. "All viruses make messenger RNAs, so it seems very possible that many others could have been picked up by LINEs and worked in," Skalka says. The others might be harder to find, however. It is possible that "they've evolved so rapidly that we don't recognize them any more."

Vetted viruses
Because these pieces have been present in vertebrate genomes for some 40 million years, "there might be some selective advantage to having them," Skalka says. For bornaviruses and filoviruses in particular, she notes, "there must be something special about these viruses," to have kept them around for so long.

Skalka speculates that these two virus groups might also be evolving more slowly. One reason might be that they have found a happy equilibrium with a reservoir species, such as bats, that has fostered the relative stasis.

Such a theory is "very tantalizing," Coffin says. "I think it makes perfectly good sense [but] it obviously requires some experimental verification."

Other possibilities for their more frequent appearance in vertebrate genomes are that they might have a special relationship with germ line cells or have RNA that is more readily recognized by LINEs, and thus more prone to get copied and spliced into the host's genetic code.

Genetic immunity
Although hemorrhagic fever viruses can be fatal to many humans and animals, these integrated viral sequences might also be conferring some protection on their hosts. Skalka explains that RNA from the integrated sequences could bind with the RNA of an incoming virus and destroy it, or that proteins made from these code segments could be similar to those of intruding viruses yet different enough to "muck up the whole replication cycle."

Some of the next steps will be to try to find more of these viral fossils in animal genomes. And as more genomes are sequenced and analysis tools become even more efficient, Coffin expects that "things that are older—and thus more diverged—will become easier to find."

But the real trick will be trying to figure out just what these genetic relics are doing in the genomes. "In the case of retroviruses" in the genome, "they have conferred benefits that have nothing to do with viruses," Coffin says, noting one retroviral gene that has been found to help with placental growth. "They are just genes that the host has found useful for one function or another."

Skalka and her team hope to find out whether the same is true for these non-retroviral genes, she says. "We would like to know what the significance is in human beings."