Underrepresentation of nonwhite ethnic groups in scientific research and clinical trials has been a disturbing trend. One particularly troubling aspect is that human genomic databases are heavily skewed toward people of European descent. If left unaddressed, this inherent bias will continue to contribute to uneven success rates in so-called precision medicine.

The problem stems from the underlying structure of science. In the early days of genomics, funding for sequencing projects was often highest in mostly white countries, so those populations are better represented in public databases. Also, some minorities have been historically mistreated by scientists (the Tuskegee syphilis experiment is one glaring example), and many members of those groups are understandably reluctant to enter studies.

Early studies were also biased by the types of genetic variation the research focused on. Initially scientists looked at only tiny, single-base-pair DNA differences between populations, ignoring larger structural variations that were more difficult to assess but that turned out to be more significant than anyone expected. These larger variations are now known to cause genetic disease and to influence the way drugs are metabolized across different ethnic populations, not just in individuals. Advanced technologies now allow scientists to identify variations that in many cases have never been seen before.

This is an exciting step forward: we are finding that some of these structural differences can explain diseases for which no cause had previously been found. One such difference explains Carney complex, a rare disorder that causes tumors to appear in various parts of the body; another is a mutation that may contribute to bipolar disorder and schizophrenia. And here, too, the effects may well vary from one ethnic group to another.

I am pleased to say that the genomics community is starting to tackle the challenge of improving the ethnic diversity in our databases. As chief scientist at a DNA-sequencing technology company, I witness these efforts every day. For instance, a number of countries have launched population-specific projects that aim to produce high-quality reference genomes. Excellent results in Korea, China and Japan have led to genomic resources that more accurately capture the natural diversity present in those populations, with positive clinical implications. Such reference genomes are also enabling large-scale studies of specific ethnic groups, which will dramatically improve those groups' representation in genomic databases.

Already these projects have led to discoveries that can make clinical trials and medical care more successful for participants with these genetic backgrounds. For example, the Korean genome project found a population-specific variant in a gene that regulates how some medications are metabolized by the body. This is essential information for dosing and for gauging the likelihood that a patient will respond to a particular therapy.

In places with less developed infrastructure, including parts of Latin America and Africa, such efforts have lagged: the National Human Genome Research Institute has begun gathering data from these areas, but sequencing and analysis are usually done elsewhere. Still, as more such projects move forward, there will be important discoveries that will be relevant to any number of ethnic groups. One such program—a National Institutes of Health effort called “All of Us”—aims to sequence a diverse sampling of Americans across gender, sexual orientation, ethnicity and race. Being inclusive is its fundamental goal, and participation is free.

In the field of rare diseases, genome sequencing has proved remarkably effective at increasing the diagnosis rate, giving answers to patients who might otherwise have gone undiagnosed. Today that approach remains most effective for Caucasian patients because more of their DNA can be interpreted using current genomic data repositories. But as we build up data for people of other ethnicities, we can expect such successes to extend rapidly to patients of any background, which stands to dramatically improve health care for hundreds of millions of people.

Achieving the vision of precision medicine for individuals of any ethnic group requires more diverse representation in the biological repositories that underlie clinical programs. Advanced DNA-sequencing technology is one tool of many needed to help generate better information about people from all ethnicities for the equitable application of those data in clinical practice.