Image: DOE HUMAN GENOME PROGRAM. GENES are encoded in DNA by four bases, the letters of the genetic alphabet (A, G, T, C), and can be very difficult to identify. Chromosomes, located in the cell's nucleus, contain the DNA.
Last summer the world celebrated when scientists from the Human Genome Project, an international consortium of academic research centers, and Celera Genomics, a private U.S. company, both announced that they had finished working drafts of the human genome. It was an important first step toward deciphering the entire genome, one of the greatest scientific undertakings of all time. But these drafts revealed only the beginning of the story: the scrolls containing the instructions for life. Now both teams have started reading, gene after gene, the actual scriptures within the scrolls. Today they will announce the results of their analyses, which will appear in separate papers in this week's Nature and Science.
Among other surprises, both papers agree that humans have a mere 26,000 to 40,000 genes, far fewer than many people predicted. For perspective, consider that the simple roundworm Caenorhabditis elegans has 18,000 genes; the fruit fly Drosophila melanogaster, 13,000. As of last summer, some estimated the human genome might include as many as 140,000 genes. It will be several more years before scientists agree on an absolute total, but most are confident that the final number won't fall outside the range reported today. "I wouldn't be shocked if it was 29,000 or 36,000," says Francis Collins, director of the National Human Genome Research Institute at the NIH. "But I would be shocked if it was 50,000 or 20,000."
An error margin of some 10,000 genes may not seem impressive after so many years of work, but genes, the actual units of DNA that encode RNA and proteins, are very difficult to count. For one thing, they are scattered throughout the genome like proverbial needles in a haystack: their coding parts constitute only about 1 to 1.5 percent of the roughly three billion base pairs in the human genome. The coding region of a gene is fragmented into little pieces, called exons, separated by long stretches of noncoding DNA, or introns. Only when messenger RNA is made, as the gene is transcribed and the resulting RNA is processed, are the introns cut out and the exons spliced together.
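To make the haystack concrete, here is a minimal Python sketch of the two ideas in this paragraph: how many base pairs 1 to 1.5 percent of a three-billion-base genome amounts to, and how exons are joined once the introns are removed. The toy sequence, the exon coordinates and the function name `splice` are all invented for illustration; they are not taken from either paper, and real genes are far longer.

```python
# Rough arithmetic from the figures quoted above.
GENOME_BP = 3_000_000_000          # "roughly three billion base pairs"
low = GENOME_BP * 1 // 100         # 1 percent coding
high = GENOME_BP * 15 // 1000      # 1.5 percent coding
print(f"Coding DNA: about {low:,} to {high:,} base pairs")


def splice(gene, exons):
    """Join the exon slices of a gene, dropping everything in between.

    `exons` is a list of (start, end) positions, end exclusive, listed
    in order along the gene. This is a toy model, not a gene finder.
    """
    return "".join(gene[start:end] for start, end in exons)


# Hypothetical toy gene: uppercase = exon, lowercase = intron.
gene = "ATGGCC" + "gtaagtttttcag" + "GATTAA"
exons = [(0, 6), (19, 25)]

spliced = splice(gene, exons)
coding_fraction = len(spliced) / len(gene)
print(spliced)
print(f"{coding_fraction:.0%} of this toy gene is coding")
```

In the toy gene about half the letters are coding; across the real genome the figure is closer to one letter in a hundred, which is why exons are so hard to spot.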
Image: DOE HUMAN GENOME PROGRAM. CLUES BY COMPARISON. The mouse genome can help scientists identify human genes because most mouse and human genes are very similar; their sequences are conserved in both genomes.
To identify functional genes, Collins explains, the scientists had to "depend upon a variety of bits of clues." Some clues come from comparisons with databases of complementary DNAs (cDNAs), which are exact copies of messenger RNAs. So, too, comparisons with the mouse genome help because most mouse and human genes are very similar; their sequences are conserved in both genomes, whereas a lot of the surrounding DNA is not. And when such clues aren't available, scientists rely exclusively on gene-predicting computer algorithms.
Because these algorithms are not totally reliable (sometimes they see a gene where there is none or miss one altogether), a few scientists doubt the new human gene count. For instance, William Haseltine of Human Genome Sciences, a company that specializes in finding protein-encoding genes only on the basis of cDNA, thinks that "the methods that have been used are very crude and inexact." He believes that there are more than twice as many genes as reported thus far by the two groups.





