The 1 Percent Genome Solution

Tiny slice of genome reveals bustling activity in the gaps between genes
















INTO THE GENOME: Researchers have begun to catalog the functional parts of the human genome. Image: ISTOCKPHOTO/KIRSTY PARGETER


The first results from a massive project to exhaustively catalogue all the functions of the human genome reveal a hotbed of activity in the gaps between genes. An international consortium of researchers sifted through 1 percent of the genome looking for pieces of DNA that are copied by the cell or help to control gene activity. The results indicate that most DNA is copied into molecules of RNA, including the long stretches between genes, and that genes overlap and interact with each other much more than researchers previously believed.

"We all suspected there was interesting stuff going on in these regions [between genes], and sure enough there is," says bioinformatician Ewan Birney of the European Bioinformatics Institute near Cambridge, England, a member of the project's computer analysis team.

Although researchers do not yet know the biological significance of these discoveries, they say that fully cataloguing the genome may help them understand how genetic variations affect the risk of contracting diseases such as cancer as well as how humans grow from a single-celled embryo into an adult. The next phase of the project, set to begin later this year, will attempt to inventory the full genome.

A genome consists of only four different nucleotide bases, or DNA subunits, arranged in a particular sequence. The publication of the human genome in 2001 revealed that sequence, the significance of which remains a mystery. In particular, genes account for only 1.2 percent of the genome's three billion bases. The remaining stretches, known as noncoding regions, were once dismissed as "junk DNA," but researchers have found that some of them are shared among mammals, suggesting they serve an important function.

To help uncover those functions and identify other important sequences, 35 research groups joined forces in 2003 to create the Encyclopedia of DNA Elements (ENCODE) project. The consortium selected 44 separate sections of the genome, chosen to span regions of high to low gene density and high to low similarity between mouse and human.

Like treasure hunters combing a vast beach with metal detectors, ENCODE researchers sifted through their patch of the genome in multiple ways that are described, along with the results, in a Nature paper published online today and in a special issue of Genome Research.

A major part of the project was identifying sequences that cells copy, or transcribe, into RNA molecules. Cells make proteins from RNA they copy from genes, but some RNAs play roles by themselves. In addition, some studies have found evidence that species from flies and worms to humans copy large amounts of RNA from noncoding DNA, with no apparent purpose. Nevertheless, "before ENCODE, I think a lot of people were skeptical of how real intergenic activity was," says bioinformatician and consortium member Mark Gerstein of Yale University.

Although genes make up only 3 percent of the ENCODE sequence, the consortium found that 93 percent of the sequence is transcribed. Some of the transcripts come from noncoding DNA, the researchers report, and those that do match up with the 399 ENCODE genes overlap with each other extensively.

Transcripts from 65 percent of the genes incorporate pieces of DNA from relatively far outside of the genes or even from one or two other genes, says molecular biologist and consortium member Tom Gingeras of Affymetrix, a genome technology company in Santa Clara, Calif. Researchers know that cells chop single genes into shorter pieces called exons, which they mix and match into one transcript for creating a protein. Gingeras says the ENCODE findings confirm recent reports that humans and flies sometimes combine exons from two different genes.

Based on the transcript sequences, the researchers identified 1,437 new promoters—short DNA sequences where transcription begins—in or between genes, on top of the 1,730 promoters they knew of. That is nearly ten promoters per gene, Birney says. He adds that the abundance of transcripts that overlap each gene suggests that the very term "gene" should mean something different inside the cell nucleus, where transcription takes place, than outside of it, where finished proteins go.


