After 80 years of fruitless struggle by human mathematicians, a major geometry conjecture has at last been solved—via a straightforward query to a chatbot.
The company OpenAI, maker of ChatGPT, announced the result yesterday, together with comments from a number of experts, who declared the artificial intelligence’s method “clever” and “elegant.” The achievement follows months of loudly reported but less impressive AI-powered advances in mathematics and marks a true milestone. Unlike those previous feats, this result would merit publication in a top math journal, as well as major media attention, even if it had been achieved by humans alone.
“No previous AI-generated proof has come close” to meeting those high standards, wrote Tim Gowers, a mathematician at the University of Cambridge, in commentary solicited by OpenAI.
“This is the unique interesting result produced autonomously by AI so far,” says Daniel Litt, a mathematician at the University of Toronto, who has no involvement with OpenAI.
The “unit distance” problem is simple to explain but formidable to solve—a mathematician’s favorite quality.
Draw nine dots on a sheet of paper. The goal is to get as many pairs of dots as possible to be an inch apart. You can put them all in a line so that you have eight pairs separated by an inch. Or you can draw a three-by-three grid and count 12 pairs. For any number of dots, even billions or trillions, the problem asks: What’s the highest number of pairs you can get?
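The two counts for nine dots can be checked by brute force. The short Python sketch below is an illustration added for this purpose and has nothing to do with the AI's proof; the function name `count_unit_pairs` and the tolerance `tol` are assumptions chosen for the example.

```python
from itertools import combinations
from math import dist, isclose

def count_unit_pairs(points, unit=1.0, tol=1e-9):
    """Count unordered pairs of points whose separation equals `unit`."""
    return sum(
        1
        for p, q in combinations(points, 2)
        if isclose(dist(p, q), unit, abs_tol=tol)
    )

# Nine dots in a row, one inch apart.
line = [(i, 0) for i in range(9)]
# Nine dots in a three-by-three grid, one inch apart.
grid = [(x, y) for x in range(3) for y in range(3)]

print(count_unit_pairs(line))  # 8
print(count_unit_pairs(grid))  # 12
```

Checking every pair like this works for a handful of dots but says nothing about the best possible arrangement, which is the hard part of the problem.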
In 1946 mathematician Paul Erdős made a guess at the best strategy. It was the grid approach but with a much smaller spacing between dots, so pairs could be established across several grid points. Erdős showed that by using sophisticated mathematics to choose this spacing extremely carefully, you could do slightly better than a simple grid—but only slightly.
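To see why a finer spacing can help, pick the "inch" to be a length that several different steps across the grid can reach. The Python sketch below is only a toy illustration of that idea, not Erdős's actual construction; the function name `count_pairs_at` and the five-by-five grid size are assumptions made for the example. It compares squared distances so the arithmetic stays exact.

```python
from itertools import combinations

def count_pairs_at(n, dist_sq):
    """Count pairs in an n-by-n integer grid at squared distance dist_sq."""
    points = [(x, y) for x in range(n) for y in range(n)]
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == dist_sq
    )

# 25 dots in a five-by-five grid.
print(count_pairs_at(5, 1))  # 40: the inch equals one grid step
print(count_pairs_at(5, 5))  # 48: the inch equals sqrt(5) grid steps
```

In the second case the grid spacing has effectively shrunk to 1/√5 of an inch, and a pair is formed by a step of one dot across and two dots up (or the reverse). Such a step can point in more directions than a step to the nearest neighbor, which is why the count rises from 40 to 48. With only nine dots the same trick backfires (8 pairs instead of 12), so the gain shows up only as the grid grows.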
In fact, Erdős claimed that no one could do better. And despite valiant efforts, for eight decades, no one did. But no one managed to prove him right either, even though most experts agreed with his intuition.
That changed two weeks ago, when OpenAI mathematicians Mehtaab Sawhney and Mark Sellke—who have made headlines recently for using AI to solve a number of less prestigious “Erdős problems”—fed the conjecture to an internal large language model (LLM) trained for general reasoning. They asked it whether Erdős was right. After churning out hundreds of pages of careful logic and calculations, it beat his long-standing record.
“It feels like magic,” Sawhney says. “It’s kind of an amazing experience to have a machine give back something which really resembles how I work.”
“What the model did is totally different from the ‘square grid’ construction,” Sellke says.
It instead constructed a more elaborate grid, one living in a kind of higher dimension. This higher-dimensional lattice of points had special mathematical symmetries that facilitate the separation of even more pairs by the same distance. The AI model then developed a way to map this otherworldly grid back down to the two-dimensional page, producing a flattened numerical “shadow.” The result is far from a grid, and Sawhney says it’s too difficult to actually draw on paper, even for a small number of dots.
The AI did not prove that its approach is the best anyone can do, though. In fact, mathematician Will Sawin has already improved upon the AI’s construction.
OpenAI privately contacted Litt, Sawin and a number of other mathematicians to verify the LLM’s proof. Together (and without the company’s direct involvement), they wrote up their individual takeaways. (No external experts have seen the AI’s original output, however—just an edited version of its train of thought.)
What stood out, they said, was the AI’s preternatural patience and focus. Human experts, largely agreeing with Erdős’s thinking, had spent more effort over the years trying to prove rather than disprove the conjecture. And even those few who looked for a counterexample would be unlikely to follow such a difficult and tedious path—constructing this high-dimensional shape—without any enticing hint of success. But an LLM experiences the costs and benefits of trial and error differently.
“AIs have an edge: It’s not just that they can try all known methods,” says Jacob Tsimerman, a mathematician at the University of Toronto, who was not involved in the work. “They can play for longer and in more treacherous waters than mathematicians without getting overwhelmed.”
Several of the experts consulted by OpenAI noted that while the unit distance problem was well known, a proof that Erdős was right would have been far more mathematically rich than a counterexample. Such proofs usually necessitate totally new insights that can then be applied to a wider range of problems. The mathematical tools the AI used here are not novel, although their application in this domain appears to be. “The model did not invent something fundamentally new that nobody saw coming,” says Sébastien Bubeck, a mathematician leading OpenAI’s mathematical explorations. “It just executed like an amazing mathematician.”
The experts also hastened to add that, without humans intervening to “clean up” the AI’s work, the result wouldn’t be so convincing. “The human still plays a vital role in discussing, digesting, and improving this proof, and exploring its consequences,” wrote mathematician Thomas Bloom in the “reflections” document.
Harvard University mathematician Melanie Matchett Wood wrote in her commentary that if the assembled human experts had combined their efforts for the same amount of time it took them to simply parse the LLM’s answer, “the mathematicians would have found a counterexample.”
This is plausible because the AI’s solution was, in hindsight, a straightforward approach that no human had ever attempted, even though the necessary tools already existed. Such circumstances are thought to be uncommon for major unsolved math problems. “I guess it got lucky that it found one of the cases where experts tried and missed something,” Litt says. Genuinely new, groundbreaking ideas remain beyond the reach of current LLMs, which leaves the machines to mine the literature for rare gems where humans missed a relatively simple approach. Even so, Litt adds, “my guess is we’re about to find out they’re actually not that rare.”
In her commentary, Wood also warned of AI’s less desirable traits as a mathematician, such as its tendency to present every idea as its own. “Our professional norms require us to cite previous work whose ideas influenced our work,” she wrote. “ChatGPT is in some sense ‘familiar’ with all the previous work.”
What this will do to mathematics—a field currently populated by humans driven by a thirst for knowledge and a passion for the elegant beauty of mathematical truth—is the bigger question, Wood concluded. “We urgently need to plan for how we can keep our work rigorous and correct,” she wrote.

