DNA Sudoku

Researchers enlist a venerable number theorem and a popular puzzle game to solve genetic medical mysteries

A 2,000-year-old math theorem, along with Sudoku, may soon help researchers untangle DNA at blazing speeds.

Hunting for a particular genetic mutation in hundreds of thousands of specimens can be an expensive and time-consuming process. In the past several years, faster multiplex DNA sequencing machines have sped up the acquisition of data, but researchers have still been hobbled by having to label each sample with a unique molecular identifier (or bar code) for analysis.

Scientists at Cold Spring Harbor Laboratory (CSHL) on Long Island, N.Y., are proposing a new take on a very old idea to analyze large sets of samples simultaneously. The team is applying the Chinese remainder theorem to pinpoint single samples in larger pools, which are arranged in rows and columns.

Devised about 2,000 years ago, the theorem is a method for mapping information using prime and co-prime numbers: a number can be recovered from the remainders it leaves when divided by several co-prime numbers. In the case of DNA sequencing and Sudoku, the theorem is used to organize data points with coordinates in a box, but it can also be used to figure out all sorts of missing information in other domains, such as distant points sensed with high-speed radar, pieces of code, and the identity of that attractive person you saw at three out of seven parties on a cruise ship.
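For readers who want to see the arithmetic, here is a minimal sketch of the theorem in Python. It is an illustration only, not code from the CSHL study; the function name `crt` is made up for this example, and the numbers come from the classic textbook puzzle (a number that leaves remainder 2 when divided by 3, remainder 3 when divided by 5 and remainder 2 when divided by 7).

```python
from math import gcd


def crt(residues, moduli):
    """Return the unique x in [0, product of moduli) with x % m == r
    for every (r, m) pair. The moduli must be pairwise co-prime.
    Hypothetical helper for illustration; requires Python 3.8+."""
    x, step = 0, 1
    for r, m in zip(residues, moduli):
        if gcd(step, m) != 1:
            raise ValueError("moduli must be pairwise co-prime")
        # Move x forward in multiples of `step` (which keeps all earlier
        # remainders intact) until it also leaves remainder r modulo m.
        t = ((r - x) * pow(step, -1, m)) % m
        x += step * t
        step *= m
    return x


# Remainders 2, 3 and 2 after dividing by 3, 5 and 7.
print(crt([2, 3, 2], [3, 5, 7]))  # prints 23
```

Three small remainders are enough to single out one number among 3 × 5 × 7 = 105 possibilities, which is the property that makes the theorem useful for locating one item in a large collection.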

By using the idea, researchers can deal with whole libraries of genetic information instead of looking at just "one genetic sequence at a time," says Yaniv Erlich, the lead author of the paper, published as the cover story of this month's Genome Research.

In Sudoku, players must fill every row and column with all nine numerals. To apply the same logic to so many genetic samples, the researchers call on state-of-the-art robots, machines and programs to place and search the specimens for them. "Every cell in a Sudoku [puzzle] is like a specimen, and every digit is like a genotype," says Erlich, a doctoral student who had used the Chinese remainder theorem in previous work with radar. He brought the idea to the attention of his CSHL professor Greg Hannon.

The process allows researchers to pool dozens of samples and assign a bar-code identifier to the pool rather than to each individual sample. After the sequencing machine returns results from a whole pool, a decoder program can use the theorem to work backward and locate a particular specimen. To find a mutation in a cystic fibrosis study, for example, the decoding program would use each pool's results as the constraints to pinpoint the location of the mutated specimen.

"Think about Sudoku as a pooling theory," Erlich says. "You have a constraint in a row and column [to] have all nine digits. We have the same thing—maybe not as neat—but we have all the sequences in the same pool." From there, he explains, a program can go back and use the same logic to find the mutant DNA.
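The pooling-and-decoding idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the authors' actual design: the specimen count (96), the two pooling windows (10 and 11) and the function names are all invented for illustration, and the sketch assumes the sequencer reports each pool simply as "mutation present" or "mutation absent."

```python
def positive_pools(carriers, moduli):
    """Simulate sequencing: specimen i sits in pool (m, i % m) for each
    modulus m, and a pool tests positive if it holds any carrier.
    Hypothetical helper for illustration."""
    return {(m, i % m) for i in carriers for m in moduli}


def decode(n_specimens, moduli, positives):
    """Return every specimen whose pools all tested positive."""
    return [
        i
        for i in range(n_specimens)
        if all((m, i % m) in positives for m in moduli)
    ]


# Assumed toy numbers: 96 specimens, pooled with the co-prime windows 10 and 11.
# That takes 10 + 11 = 21 bar-coded pools instead of 96 bar-coded specimens.
moduli = (10, 11)
hits = positive_pools(carriers=[57], moduli=moduli)
print(sorted(hits))              # prints [(10, 7), (11, 2)]
print(decode(96, moduli, hits))  # prints [57]
```

Because 10 × 11 = 110 is larger than 96, no two specimens share both pools, so a single carrier is always identified exactly. With several carriers, this simple decoder can also flag non-carriers that happen to sit only in positive pools; resolving such ambiguity would call for additional pooling windows.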

In the future, sequencing and analysis that would have taken months and $10 million could require just a few days of machine time and $50,000 to $80,000, the study authors note. All thanks to ancient Chinese number logic and a popular pen-and-paper puzzle game—which Erlich now plays regularly.
