The so-called muon anomaly, first seen in an experiment at Brookhaven National Laboratory in 2001, hasn’t budged. For 20 years, this slight discrepancy between the calculated value of the muon’s magnetic moment and its experimentally determined one has lingered at a significance of about 3.7 sigma. That is a confidence level of 99.98 percent, or about a one-in-4,500 chance that the discrepancy is a random fluctuation. With the just-announced results from the Muon g-2 experiment at Fermi National Accelerator Laboratory in Batavia, Ill., the significance has increased to 4.2 sigma. That is a confidence level of about 99.997 percent, or about a one-in-40,000 chance that the observed deviation is a coincidence. By itself, the new Fermilab measurement has a significance of only 3.3 sigma, but because it reproduces the earlier finding from Brookhaven, the combined significance has risen to 4.2 sigma. Still, that falls short of particle physicists’ five-sigma discovery threshold.
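For readers who want to check how a sigma value translates into a confidence level, here is a minimal Python sketch. It assumes the usual convention that the fluctuation follows a Gaussian distribution and that deviations in either direction count (a two-sided test); the function names are illustrative only. Under that assumption the output is consistent with the rounded figures quoted above.

```python
import math


def two_sided_p_value(sigma: float) -> float:
    """Chance that a Gaussian variable lands at least `sigma` standard
    deviations away from its mean, in either direction."""
    return math.erfc(sigma / math.sqrt(2.0))


def one_in(sigma: float) -> float:
    """Express the same chance as 'one in N'."""
    return 1.0 / two_sided_p_value(sigma)


for sigma in (3.3, 3.7, 4.2, 5.0):
    p = two_sided_p_value(sigma)
    confidence = 100.0 * (1.0 - p)
    print(f"{sigma} sigma: confidence {confidence:.5f}%, "
          f"about one in {one_in(sigma):,.0f}")
```

A one-sided convention, which counts only upward fluctuations, would halve each probability, so it matters which convention a quoted number uses.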

The result has been long awaited because of its potential to finally break the Standard Model of particle physics, the collection of all fundamental constituents of matter known so far, which has been in place for about 50 years. This model presently contains a couple dozen particles, but most of them are unstable and therefore can’t be found just by looking at the matter that normally surrounds us. The unstable particles are, however, naturally produced in highly energetic events, such as when cosmic rays hit the upper atmosphere. They are also made in lab-created particle collisions, such as those used in the Fermilab experiments to measure the muon’s magnetic moment.

The muon was one of the first unstable particles known, with its discovery dating back to 1936. It is a heavier version of the electron, and like the latter particle, it is electrically charged. The muon has a lifetime of about two microseconds. For particle physicists, that’s a long time, which is why the particle lends itself to precision measurements. The muon’s magnetic moment determines how fast the particle’s spin axis precesses around magnetic field lines. To measure it at Fermilab, physicists create muons and use powerful magnets to keep them moving in a circle about 15 meters in diameter. The muons eventually decay, and from the distribution of the decay products, one can infer their magnetic moment.

The result is usually quoted as the “g-2,” where “g” is the dimensionless factor that sets the strength of the magnetic moment. The “2” is subtracted because the value of g is close to two, and the deviations from two contain the quantum contributions that physicists are interested in. These contributions come from vacuum fluctuations that contain all particles, albeit in virtual form: they only appear briefly before disappearing again. This means that if there are more particles than those in the Standard Model, they should contribute to the muon g-2, which is why the quantity matters. A deviation from the Standard Model prediction could therefore mean that there are more particles than those that are currently known, or that there is some other new physics, such as additional dimensions of space.

So how are we to gauge the 4.2-sigma discrepancy between the Standard Model’s prediction and the new measurement? First of all, it is helpful to remember the reason that particle physicists use the five-sigma standard to begin with. The reason is not so much that particle physics is somehow intrinsically more precise than other areas of science or that particle physicists are so much better at doing experiments. It’s primarily that particle physicists have a lot of data. And the more data you have, the more likely you are to find random fluctuations that coincidentally look like a signal. Particle physicists began to commonly use the five-sigma criterion in the mid-1990s to save themselves from the embarrassment of having too many “discoveries” that later turn out to be mere statistical fluctuations.
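The argument that more data produces more fake signals can be made concrete with a rough sketch. It assumes, purely for illustration, that an experiment performs a number of independent searches and asks how likely it is that at least one of them fluctuates past a given sigma threshold by chance. The search counts below are hypothetical, not taken from any real experiment, and real analyses are neither fully independent nor this simple.

```python
import math


def local_p_value(sigma: float) -> float:
    """Two-sided Gaussian chance of a fluctuation of at least `sigma`
    in a single search."""
    return math.erfc(sigma / math.sqrt(2.0))


def chance_of_false_alarm(sigma: float, n_searches: int) -> float:
    """Chance that at least one of `n_searches` independent searches
    fluctuates past `sigma`: 1 - (1 - p)**n, computed stably."""
    p = local_p_value(sigma)
    return -math.expm1(n_searches * math.log1p(-p))


# Hypothetical numbers of independent searches, for illustration only.
for sigma in (3.0, 5.0):
    for n_searches in (1, 100, 10_000):
        chance = chance_of_false_alarm(sigma, n_searches)
        print(f"{sigma} sigma, {n_searches:>6} searches: "
              f"{100.0 * chance:.4f}% chance of at least one fluke")
```

Under these assumptions a three-sigma threshold becomes nearly certain to be crossed somewhere once the number of searches reaches the thousands, while a five-sigma threshold stays rare.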

But of course five sigma is an entirely arbitrary cut, and particle physicists also discuss anomalies well below that limit. Indeed, quite a few three- and four-sigma anomalies have come and gone over the years. The Higgs boson, for example, was already “discovered” in 1996, when a signal of about four sigma appeared at the Large Electron-Positron Collider (LEP) at CERN near Geneva—and then disappeared again. (The Higgs was conclusively detected in 2012 by LEP’s successor, the Large Hadron Collider, or LHC.) Also in 1996 quark substructures were found at around three sigma. They, too, disappeared.

In 2003 signs of supersymmetry, a conjectured extension of the Standard Model that introduces new particles, were seen at LEP, also at around three sigma. But soon they were gone. At the LHC in 2015, we saw the diphoton anomaly, which lingered around four sigma before it vanished. There have also been some stunning six-sigma discoveries that were not confirmed, such as the 1998 “superjets” at Fermilab’s Tevatron (even now no one really knows what they were) or the 2004 pentaquark sighting at the Hadron-Electron Ring Accelerator (HERA) in Germany (pentaquarks weren’t actually detected until 2015).

This history should help you gauge how seriously to take any particle physics claim with a statistical significance of 4.2 sigma. But of course the g-2 anomaly has in its favor the fact that its significance has gotten stronger rather than weaker.
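How a second, independent measurement strengthens the first can be sketched with an inverse-variance weighted average. The input values below are not given in this article: they are recalled from the published Brookhaven, Fermilab and Standard Model numbers for the muon anomaly, in units of 10^-11, and should be checked against the original publications. The sketch also assumes uncorrelated Gaussian uncertainties, which is a simplification.

```python
import math

# (value, uncertainty) in units of 1e-11. ASSUMPTION: recalled from the
# published results, not taken from this article; verify before reuse.
BROOKHAVEN = (116_592_089.0, 63.0)
FERMILAB = (116_592_040.0, 54.0)
STANDARD_MODEL = (116_591_810.0, 43.0)


def combine(measurements):
    """Inverse-variance weighted mean of independent measurements."""
    weights = [1.0 / err ** 2 for _, err in measurements]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, measurements)) / total
    return mean, math.sqrt(1.0 / total)


def significance(measurement, prediction):
    """Difference divided by the two uncertainties added in quadrature."""
    (m_value, m_err), (p_value, p_err) = measurement, prediction
    return (m_value - p_value) / math.hypot(m_err, p_err)


combined = combine([BROOKHAVEN, FERMILAB])
for label, result in (("Brookhaven", BROOKHAVEN),
                      ("Fermilab", FERMILAB),
                      ("Combined", combined)):
    print(f"{label}: {significance(result, STANDARD_MODEL):.1f} sigma")
```

The combined uncertainty is smaller than either input, so two measurements that agree with each other pull the significance above what either reaches alone.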

What does the persistence of the anomaly mean? High-precision experiments at low energy, such as this one, complement high-energy experiments. They can provide similar information because, in principle, all the contributions from high energies are also present at low energies. It’s just that they are very small—we’re talking about a discrepancy between experiment and theory at the 11th digit after the decimal point.

In practice, this means that the calculations for the predictions have to exactly account for a lot of tiny contributions to reach the required precision. In particle physics, these calculations are done using Feynman diagrams—little graphs with nodes and links that denote particles and their interactions. They are a mathematical tool to keep track of which integrals must be calculated.

These calculations become more involved with higher precision because there are more and bigger diagrams. For the muon g-2, physicists had to calculate more than 15,000 diagrams. Although computers help greatly in the task, these calculations remain quite challenging. A particular headache is the hadronic contribution. Hadrons are composite particles made of several quarks held together by gluons. Calculating these hadronic contributions to the g-2 value is notoriously difficult, and it’s presently the largest source of error on the theoretical side. Various cross-measurements also play a role, because the predictions depend on the values of other constants, including the masses of leptons and coupling constants.

Thus, the discrepancy could rather mundanely mean that there’s something wrong with the Standard Model calculation, with the hadronic contributions as the primary suspect. But there is also the possibility that the shortcoming lies within the Standard Model itself and not our calculation. Maybe the discrepancy comes from new particles, with supersymmetric particles the most popular candidates. The problem with this explanation is that supersymmetry isn’t a single model; instead, it’s a property shared by a large number of models, each of which yields different predictions. Among other things, the g-2 contributions depend on the masses of the hypothetical supersymmetric particles, which are unknown. So for now it’s impossible to attribute the discrepancy to supersymmetry in particular.

Fermilab’s new high-precision measurement of the magnetic moment is a remarkable experimental achievement. But it’s too soon to declare the Standard Model broken.

*This is an opinion and analysis article.*