The math mystery that connects Sudoku, flight schedules and protein folding

Thousands of notoriously difficult problems in computer science are actually the same problem in disguise

Playing a wood Sudoku game board with puzzle in progress in stock photo — With NP-complete problems, you could discover a fast algorithm to solve Sudoku puzzles that could also break the encryption schemes that protect our digital economy.

Natalia Barliaeva/Getty Images

Sudoku fan? After diving into the math behind the game, test your skills with our very own puzzles in SciAm Games!

Computer science seemingly rides a curve of unstoppable progress. Mere decades took us from vacuum tubes to microchips, from dial-up to high-speed Internet, and from Office Assistant Clippy to ChatGPT. Yet thousands of everyday problems across science and industry remain just as unsolvable as ever for today’s fleet of supercomputers powered by artificial intelligence.

People working on these notoriously hard “NP-complete” problems could win a million-dollar prize, awarded by the nonprofit Clay Mathematics Institute, for either finding their fast solution or proving that none exists. An amazing insight from the 1970s makes this challenge even more tantalizing: these 1,000-plus problems are, in a deep sense, one and the same. If you solve one, you solve them all. This concept, now fundamental in the field of theoretical computer science, shows that certain groups of computational problems form a unified web. Discover a fast algorithm that solves Sudoku puzzles of any size, and you can now break the encryption schemes that protect our digital economy. Reveal a shortcut for scheduling a flight tour within a budget, and you can use it to solve nearly any famous open math problem.

Finding fast algorithms for these NP-complete problems (or proving that no such algorithms exist) would resolve the “P versus NP” question, which is the most important mystery in computer science. P refers to the set of computational problems that computers can solve efficiently. Set NP, meanwhile, contains the problems whose solutions can be verified efficiently—but these problems can’t necessarily be solved quickly. NP includes everything in P (because finding a solution is a perfectly good way to verify that solution), as well as harder problems for which we don’t know efficient methods for finding solutions. We can verify them only once they are solved. The P-versus-NP question asks whether this apparent asymmetry between finding a solution and verifying one is real or illusory. Maybe P = NP, and they refer to the same set of problems. In other words, maybe the NP problems that we don’t know how to solve efficiently only appear hard relative to P problems because we have yet to find the right insights.

On supporting science journalism

If you're enjoying this article, consider supporting our award-winning journalism by subscribing. By purchasing a subscription you are helping to ensure the future of impactful stories about the discoveries and ideas shaping our world today.

For example, given a budget, any large list of cities and the connecting flights between those places, is there an algorithm (a recipe of simple instructions) that would efficiently decide whether you can visit all the cities while respecting the budget? We don’t know. We do know an inefficient algorithm: check every possible sequence of flights that takes you to all the cities, add up the cost of each, and compare the totals with the budget. But as the number of cities on the list grows, the number of routes to check explodes exponentially, quickly growing infeasible even for the fastest computers. There may or may not be some clever shortcut that circumvents this exhaustive search, but computer scientists have yet to find one. Given a solution, however—in this case, a proposed list of flights—one could verify in a reasonable amount of time whether a route hits every city and stays under budget. If P equals NP, the flight scenario (an example of what is called the traveling salesperson problem) presumably has a speedy solution. We just don’t know it yet.

Many natural computational problems join the traveling salesperson problem in the NP set. This includes challenges from logistics (such as packing boxes into trucks), social networks (finding cliques of mutual friends), biology (predicting how proteins will fold), and games such as Sudoku, Pokémon and Candy Crush. We can even cast math itself as an NP problem because its proofs can be verified efficiently. It may seem strange to classify these as “hard” problems when people pack boxes into trucks and solve Sudokus every day. But we consider an algorithm to have solved a problem efficiently only if it solves every instance efficiently, including very large ones. Of course, a computer can solve a 9 × 9 Sudoku faster than a one-million-by-one-million Sudoku, so the rigorous definition of “efficient” appeals to how the time required to solve a problem scales with the size of the input.

The P-versus-NP question concerns a variety of computational problems and how they relate to one another, so it may seem like a resolution would require investigating each of those problems individually. Say you were to find an efficient algorithm for the traveling salesperson problem. This discovery would be a heroic breakthrough, but would it tell you anything about your ability to solve huge Sudokus or any other challenging NP problem? Amazingly, your algorithm for that single problem would fully resolve P versus NP.

In 1972 computer scientist Richard M. Karp published a seminal paper demonstrating that 21 classic NP problems have a remarkable property: an efficient algorithm for solving any one of them could be used not only to solve the other 20 but to solve every problem in NP. He called these 21 problems NP-complete. In the intervening years, that list has grown as researchers discovered many other NP problems that share this magic property (including the traveling salesperson one).

We can view NP-completeness with optimism or pessimism. On the optimistic hand, a fortress of monstrously difficult problems standing between us and untold technological promise now looks more like a house of cards. Yank one into the realm of feasibility, and the entire NP edifice collapses; then a scientific revolution rises from the rubble, filled with effortlessly efficient travel, rapid drug discovery via protein folding, and a new age of mathematics. On the pessimistic hand, NP-completeness suggests that these problems do not have efficient algorithms; if proving otherwise amounts to conquering just a single problem, then why hasn’t anybody succeeded yet? Most experts lean toward the latter interpretation and suspect that NP-complete problems don’t have fast algorithms.

Whether the glass is viewed as half full or half empty, the concept of complete problems has changed the way researchers view computation. Karp showed that he could use an algorithm for one NP-complete problem to solve another by first demonstrating that you can translate seemingly unrelated problems into one another’s language by using a process called a reduction. It works by showing you how to take any instance of one problem and convert it into another problem. For instance, if you start with a problem that involves a list of cities, flights between them and a budget, you can reduce it to a large Sudoku puzzle in such a way that the Sudoku has a valid solution only if it’s possible to visit all the cities while staying within the budget (and doesn’t have a valid solution otherwise). That way, if you discovered an efficient algorithm for Sudoku, then you could use it to also solve the traveling salesperson problem by converting instances of the latter into Sudoku puzzles.

How Does Reduction Work?

For a deeper dive into how reduction works, let’s reduce another type of NP-complete problem, the “map three-coloring problem,” to the “clique problem.” The map three-coloring problem asks: Given a map, can you assign one of three colors to each region so that no neighboring regions have the same color? And the clique problem asks: Does a given social network contain a group of a desired number of people who are all mutual friends? Both problems are NP-complete, meaning we don’t know any efficient algorithm for either of them. On the surface, they have little in common. But I’ll show that given a map, we can transform it into a social network in such a way that the answer to the social network problem will give us the answer to the map problem.

Picture a U.S. map. To build a social network out of it, designate three “people” for every state, one for each of three colors: blue, green and red. Then make two people friends unless:

They represent the same state (the green Wisconsin representative will not be friends with the blue Wisconsin one) or
They have the same color and represent neighboring states (North Dakota and South Dakota share a border, so their red representatives will not be friends, but North Dakota and Florida don’t, so their red representatives will be friends).

I claim that this social network of 150 people will contain a clique of 50 mutual friends only if the U.S. map has a valid three-coloring. If we find 50 mutual friends in the network, they must all represent different states because in our design we didn’t make people friends when they represented the same state. Furthermore, the coloring that corresponds to the clique would never result in neighboring states having the same color—we explicitly forbade such links in the network. So a clique of 50 people would correspond to a valid three-coloring. Likewise, if no 50-clique exists in the network, then no three-coloring exists for the map.

We just reduced the map three-coloring problem to the clique problem. This means if somebody discovered a fast algorithm for the clique problem, they could use it to solve any instance of the map three-coloring problem. Critically, the first step—transforming the map into a network—is fast. Creating the people in the network and the appropriate friendship relations does not require any exhaustive search or other infeasible computational overhead.

Reductions show that even if our problems seem one of a kind, they may be more universal than they appear. A web of reductions unites all NP-complete problems. Solve any one of them, and you can solve any other NP problem.

This ability to encode one problem by using the language of another is also a feature of computation itself. The implications boggle the mind. Remember we can frame proving mathematical theorems as an NP-complete problem. Pick any famous unsolved math question. The theory of NP-completeness tells us that there exists some level of Candy Crush that perfectly encodes that question. If a certain score is achievable in a certain number of moves on that level of Candy Crush, then your math problem has a proof of a certain length; otherwise it doesn’t. NP-completeness also assures us that certain advances in protein folding (or box packing or Sudoku solving) would destroy the digital economy. That’s because the encryption that protects our sensitive data works by vaulting them behind computational problems believed to be intractable.

It’s worth noting that although solving an NP-complete problem would allow you to break encryption, the reverse is not true; the intractable problems underlying most encryption schemes are not quite NP-complete themselves.

With so much riding on NP-complete problems, a million bucks might seem like a bargain for their solution. And it might offer a bit of added motivation the next time you struggle to schedule your vacation trip or crack a Sudoku puzzle.

It’s Time to Stand Up for Science

If you enjoyed this article, I’d like to ask for your support. Scientific American has served as an advocate for science and industry for 180 years, and right now may be the most critical moment in that two-century history.

I’ve been a Scientific American subscriber since I was 12 years old, and it helped shape the way I look at the world. SciAm always educates and delights me, and inspires a sense of awe for our vast, beautiful universe. I hope it does that for you, too.

If you subscribe to Scientific American, you help ensure that our coverage is centered on meaningful research and discovery; that we have the resources to report on the decisions that threaten labs across the U.S.; and that we support both budding and working scientists at a time when the value of science itself too often goes unrecognized.

In return, you get essential news, captivating podcasts, brilliant infographics, can't-miss newsletters, must-watch videos, challenging games, and the science world's best writing and reporting. You can even gift someone a subscription.

There has never been a more important time for us to stand up and show why science matters. I hope you’ll support us in that mission.

Thank you,

David M. Ewalt, Editor in Chief, Scientific American