The reason for a mysterious excess of genelike molecules in human cells is finally becoming clear, but the explanation further blurs the notion of what a gene is.

Researchers know that the cells of species such as yeast, flies and humans make far more RNA molecules—copied from DNA—than they seem to need. Now a team has identified three new classes of RNA, along with evidence that genes become more active as these strange RNA molecules become shorter.

Because the newly discovered RNA molecules are born in one gene but may affect another distant gene, "the whole concept of a gene is really up for discussion here," says study leader Thomas Gingeras, a genomics researcher at the Santa Clara, Calif.-based genome technology company Affymetrix.

Cells make proteins by copying genes into RNA molecules, which serve as templates for building proteins. The team wanted to figure out why cells create so much RNA from regions of the genome that do not encode proteins. Some of this RNA consists of short pieces that interfere with the activity of genes, but the rest is more mysterious.

To determine the function and destination of this excess RNA, Gingeras and colleagues searched for relatively long RNA molecules in the nucleus or cytoplasm of human cells, and for shorter RNA molecules anywhere in the cell.

They identified a set of long RNA molecules, which they dubbed PALRs (for promoter-associated long RNAs), and a set of short ones, PASRs (promoter-associated short RNAs). Each molecule overlapped with the front end of a gene, and these overlaps occurred in nearly half of all protein-coding genes.

The long and short RNA molecules also overlapped with each other, which suggests that the shorter molecules may be fragments of the longer ones, the researchers report in a paper published online today by Science.

In addition, they found that the more active a gene, the more PASRs it had. The same was true for a third class of RNA they identified, molecules of which were short like PASRs but overlapped with the back end of a gene.

What that indicates, Gingeras says, is that the long RNA molecules are being copied from near the front of a gene, where they interfere with the gene's activity until they are reduced to smaller pieces. Neighboring genes may influence this process, he says, so "you now begin to blur the boundaries of where genes start and end."

Genes already had fuzzy borders, but Gingeras says the new concept, if true, "is pointing to a much more elegant and complex structure for a protein-coding gene."