By Elie Dolgin

The GenBank sequence database, the central repository of all publicly available DNA sequences, counted its thousandth complete microbial genome this month. But a thousand genomes is only a small fraction of the diversity that exists in the microscopic world. Now, scientists want to fill in the gaps.

"The broad brush strokes of microbial diversity are not adequately represented in that first thousand," says Stephen Giovannoni, a microbiologist at Oregon State University in Corvallis. "It's absolutely important that we sequence more."

Enter the Genomic Encyclopedia of Bacteria and Archaea, a project spearheaded by the US Department of Energy's Joint Genome Institute (JGI) in Walnut Creek, California, which aims to sequence the genomes of another thousand or so microbes.

The vast majority of microbial species that have had their genomes decoded come from just three groups and were chosen because of their medical or environmental importance. The encyclopedia's researchers are picking microbes from many more branches of the evolutionary tree of life.

Branching out

A pilot project, led by Jonathan Eisen, an evolutionary biologist at the University of California, Davis, demonstrates how such an approach can pay off. Eisen's team selected and sequenced more than 100 'neglected' species that lacked close relatives among the 1,000 genomes already in GenBank. The researchers reported earlier this year at the JGI's Fourth Annual User Meeting that even mapping the first 56 of these microbes' genomes increased the rate of discovery of new gene and protein families with new biological properties. It also improved the researchers' ability to predict the role of genes with unknown functions in already sequenced organisms.

"There's no doubt to us that filling in the branches of the tree is going to be useful to lots of scientific studies that use genomic data," says Eisen. "There have been four billion years of evolution and we can really benefit from having some of that information in our databases."

All these new genomes should improve researchers' understanding of the evolution, physiology and metabolic capacity of microbes, says Eisen. They will also help to match DNA sequences from large-scale, high-throughput metagenomic studies of environmental samples to their proper species, and ultimately contribute to the fields of synthetic biology and genetic engineering.

Microbial madness

The JGI is not alone in sequencing the genomes of new microbes. The next phase of the Human Microbiome Project, an initiative run by the US National Institutes of Health to characterize all the microorganisms living in or on our bodies, involves deciphering 900 microbial genomes of species from five body sites -- skin, mouth, nose, gut and vagina. The Ten Thousand Microbial Genomes Project, launched in August by the Beijing Genomics Institute-Shenzhen, plans to create full genome maps for 10,000 strains -- but fewer species -- of bacteria, archaea, fungi, algae and viruses from a broad range of conventional and extreme environments in China.

And even more ambitious microbial initiatives are in the works. Nikos Kyrpides, head of the JGI's genome-biology programme, is planning an international, five-year effort to sequence all 15,000 known microbes that can live in laboratory culture. He hopes to publish a paper outlining the project's goals in the coming months. "This will completely transform our understanding of microbiology and the microbial planet," he says.