Scientists published the first draft of the human genome nearly a decade ago, but the hunt for disease genes is far from over. Most researchers have focused on single changes in DNA base pairs (AT and CG) that cause fatal diseases, such as cystic fibrosis. Such mutations among the genome’s three billion base pairs don’t tell the whole story, however. Recently geneticists have taken a closer look at a genetic aberration previously considered rare: copy number variation (CNV). The genes may be perfectly normal, yet there is a shortage or surplus of DNA sequences that may play a role in diseases that defy straightforward genetic patterns, such as autism, schizophrenia and Crohn’s disease, the causes of which have stumped researchers for decades.

American geneticist Calvin Bridges discovered copy number variation in 1936, when he noticed that flies that inherit a duplicate copy of a gene called Bar develop very small eyes. Two decades later a French researcher studying human chromosomes under a microscope identified CNV as the cause of Down syndrome: sufferers inherit an extra copy of chromosome 21. By all appearances, CNV was rare and always a direct cause of disease.

In 2004, however, things changed. Two groups of researchers published the first genome-wide CNV maps, which illustrated that variation in gene quantity is actually quite common: each group found about 12 copy number imbalances per person. “When these papers came out, they really turned everything on its head,” says Stephen Scherer, a geneticist at the Hospital for Sick Children in Toronto and a co-author of one of the papers. “People always thought, as did we, that these large changes in DNA were always associated with disease.”

Scherer and his colleagues, who included population geneticist Matthew Hurles of the Wellcome Trust Sanger Institute in Cambridge, England, followed up with a higher-resolution CNV study in 2006, which analyzed DNA from 270 individuals and identified an average of 47 copy number variations per person. And in 2007 researchers sequenced the genome of genetic pioneer J. Craig Venter and found 62 copy number variations. Evidently, Hurles says, “it’s not normal to be walking around with the perfect genome.”

Scientists are still trying to decipher exactly how these variations—most of which are inherited—affect the body. Typically if a genome has three copies of a gene instead of the normal two (one from each parent), a cell will make proteins from all three, producing more than it probably needs. But such gene expression is “not always the case—there are exceptions,” Scherer says. Sometimes cells make the correct amount anyway; other times CNVs affect DNA regions that regulate the expression of still other genes, making the problem more complicated.

Even so, scientists have been able to link CNVs to a handful of complex diseases. A September 2008 study in Nature confirmed earlier findings suggesting that 30 percent of people who have a deleted length of three million base pairs in a region of chromosome 22 suffer from psychiatric conditions such as autism and schizophrenia. A Nature Genetics study from August 2008 found a link between Crohn’s disease and a 20,000 base-pair deletion in a region upstream of a gene called IRGM, which is involved in fighting invasive bacteria.

And in January 2009 another Nature Genetics paper found an association between high body mass index and a 45,000 base-pair deletion in a gene called NEGR1, which affects neuronal growth in the hypothalamus, a brain region that regulates hunger and metabolism. “We’re coming up with so much data, and new kinds of data, that it’s hard to keep up,” remarks Edwin Cook, Jr., a psychiatrist at the University of Illinois at Chicago.

Copy number variation could help explain why complex diseases are often inherited but not always linked to the same genes: they may affect risk in a probabilistic manner, explains Steven McCarroll, a population geneticist at the Massachusetts Institute of Technology and a co-author of the Crohn’s disease study. “The IRGM deletion may increase risk of Crohn’s by only 40 percent, but it does so in millions of people,” he says. Whether a person actually acquires the disease may depend on additional genetic or environmental factors.
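
To make the arithmetic behind McCarroll’s point concrete, here is a minimal sketch in Python. Only the 40 percent figure comes from the quote above; the baseline risk and the number of carriers are hypothetical placeholders chosen for illustration, not figures from the study, and the function names are invented for this example.

```python
def risk_with_variant(baseline_risk: float, relative_increase: float) -> float:
    """Absolute risk for a carrier, given a baseline risk and a relative increase."""
    return baseline_risk * (1.0 + relative_increase)


def extra_cases(carriers: int, baseline_risk: float, relative_increase: float) -> float:
    """Expected additional cases among carriers that are attributable to the variant."""
    carrier_risk = risk_with_variant(baseline_risk, relative_increase)
    return carriers * (carrier_risk - baseline_risk)


# From the article: the deletion may raise risk by about 40 percent.
RELATIVE_INCREASE = 0.40

# Hypothetical values, for illustration only (not taken from the study).
BASELINE_RISK = 0.002      # assumed 0.2 percent risk without the deletion
CARRIERS = 10_000_000      # assumed number of people carrying the deletion

carrier_risk = risk_with_variant(BASELINE_RISK, RELATIVE_INCREASE)
added = extra_cases(CARRIERS, BASELINE_RISK, RELATIVE_INCREASE)

print(f"Risk for a carrier: {carrier_risk:.2%}")
print(f"Expected additional cases among carriers: {added:,.0f}")
```

Under these assumed numbers, a carrier’s risk rises from 0.2 percent to 0.28 percent, a change that is barely visible for any one person, yet it adds up to roughly 8,000 additional cases across 10 million carriers. That is the sense in which a modest relative increase can matter at the population level.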

As researchers hunt for more links between known CNVs and disease, Scherer and Hurles are scouting out new variants to add to the mix. Their 2006 map identified CNVs only down to 20,000 base pairs; now they are finishing a revised map that includes variants as short as 500 base pairs. The analysis suggests that about 1,000 copy number variations exist in each person, spanning at least 1 percent of the genome.

“We’ve come really far and really fast,” Scherer says. But “over the next year, we’re going to be finding more small CNVs and more common CNVs associated with disease—2009 is going to be a watershed year.”

Note: This article was originally published with the title “Too Little, Too Much.”