On April 7, particle physicists all over the world were excited and energized by the announcement of a measurement of the behavior of muons—the heavier, unstable subatomic cousins of electrons—that differed significantly from the expected value.

A century from now, looking back on this moment, will historians understand this excitement? They certainly won’t see a major turning point in the history of science. No puzzle was solved, no new particle or field was discovered, no paradigm shifted in our picture of nature. What happened on April 7 was just an announcement that the muon’s wobble—its value is called g-2—had been measured a little more precisely than before, and that the international high-energy physics community was therefore a little more confident that other particles and fields are out there yet to be discovered.

Nevertheless, historians of science will see this as a special moment, not because of the measurement but because of the measuring. The first results of the experiment at Fermilab were the outcome of a remarkable and perhaps even unprecedented set of interactions between an extraordinarily diverse set of scientific cultures that, over 60 years, evolved independently yet required each other.

Early theoretical calculations of g-2 according to quantum electrodynamics received a jolt in 1966 when Cornell theorist Toichiro Kinoshita realized that his previous studies had prepared him well to work out its value. He did his first calculations by hand, but they soon became too unwieldy to be performed that way, and he came to depend on computers and special software. To make the prediction ever more precise, he had to incorporate work by different groups of theorists who specialized in the vast and diverse panoply of interacting particles and forces that subtly influence the g-2 value. (Kinoshita is retired, and today the theoretical value is worked on by more than 100 physicists.) The result was a specific prediction, relying on the contributions of many theorists, with a minuscule error bar that made a clear experimental target.

The initial experimental work on a g-2 measurement, which began at CERN in 1959, involved a multistep process. The experimenters used a particle accelerator to make unstable particles called pions, then channeled these into a flat magnet where the pions decayed into muons. The muons were forced to turn in circles, and the whirling muons were made to “walk” in steps down the magnet. The muons emerged from the other end of the magnet into a field-free region where their orientation could be measured, allowing the experimenters to infer their g-2.

The next experiment, which started at CERN in 1966, used a more powerful accelerator to produce and inject larger numbers of pions into a five-meter-diameter storage ring with a magnetic gradient to contain the resulting hordes of muons. The third CERN experiment, which began operations in 1969, was a major leap forward. It used a much larger 14-meter-diameter storage ring and ran at a certain “magic” energy where the electric field would not affect the muon spin. This made it possible to have a uniform magnetic field, dramatically sharpening the sensitivity of the measurement. But with that enhanced sensitivity came new sources of precision-sabotaging instrumental noise; another set of methods had to be applied to reduce uncertainties in the magnets and to measure the magnetic field.

The fourth generation of g-2 experiments—begun at Brookhaven National Laboratory in 1999—required even more years of laborious struggle to beat back sources of error and control various disruptive factors. Like the third CERN experiment, it used a storage ring, 3.1 giga-electron-volt muons, the magic energy, and a uniform field; but unlike the CERN experiment it had a higher flux, muon injection rather than pion injection, superconducting magnets, a trolley equipped with NMR probes that could be run around inside the vacuum chamber to check the magnetic field, and a kicker inside the storage ring.

These and other features added to the experiment’s complexity and expense. The experiment involved 60 physicists from 11 institutes; it issued its g-2 value in 2004. In 2013, the Brookhaven g-2 storage ring was transported to Fermilab and given new life, rebuilt and operated with a host of ever more subtle and sophisticated new tricks needed to push the outer limits of precision further. Ultimately, all those overlapping decades of work collectively produced the measurement announced this month, one with a tiny error bar that made it meaningful to compare with the theoretical prediction, which by then also had a narrow error bar.

The late Francis Farley, the spokesperson for the very first g-2 experiment at CERN, once told me, “What the theorists do and what we experimenters do is completely different. They talk about Feynman diagrams, amplitudes, integrals, expansions and a whole lot of complex mathematics. We hook up an accelerator to beam lines and steering magnets to the device itself, which is stuffed with wires, thousands of cables, timing devices, sensors and such things. It’s two totally different worlds! But they come out with a number, we come out with a number, and these numbers agree to parts per million! It’s unbelievable! That is the most astonishing thing for me!”

At the April 7 announcement, the participating physicists displayed a graph with two error bars, one for the theoretical prediction and the other for the experimental measurement. All the excitement sprang from the tiny but indisputable gap—2.5 parts per billion—between the two. If either bar had been wider, it would have blended into the other, and the measurement would not have indicated physics awaiting discovery. To make the experiment happen, the scientific community, and the government agencies providing the funding, had placed enormous trust in the international team of collaborators.

What will amaze historians of science in the future, I think, will be that today’s scientists could produce that puny but revealing gap at all.

This is an opinion and analysis article.