December 12, 2014

Forecasting the Sun's Fury: How Artificial Intelligence Can Predict Solar Flares


By Monica Bobra


This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

A couple of months ago, the sun sported the largest sunspot we’ve seen in the last 24 years. This monstrous spot, visible to the naked eye (that is, without magnification, but with protective eyewear of course), launched more than 100 flares. The number of spots on the sun ebbs and flows in a cycle of roughly 11 years. Right now, the sun is in the most active part of this cycle: we’re expecting lots of spots and lots of flares in the coming months.

Usually, the media focuses on the destructive power of solar flares: the chance that, one day, a huge explosion on the sun will fling a ton of energetic particles our way and fry our communication satellites. But there’s less coverage of how we forecast these events, much as we forecast the weather, so that we can prevent potential damage. How do you forecast a solar flare, anyway?

One way is to use machine learning programs, a type of artificial intelligence that learns automatically from experience. These algorithms gradually improve their mathematical models every time new data come in. To learn properly, however, the algorithms require large amounts of data. Scientists lacked solar data on this scale before the 2010 launch of the Solar Dynamics Observatory (SDO), a sun-watching satellite that downlinks about a terabyte and a half of data every day, more than any other satellite in NASA history.


Solar flares are notoriously complex. They occur in the solar atmosphere, above surface-dwelling sunspots. Sunspots, which generally come in pairs, act like bar magnets — that is, one spot acts like a north pole and the other like a south. Given that there are lots of sunspots, that various layers on the sun are rotating at different speeds, and that the sun itself has a north and south pole, the magnetic field in the solar atmosphere gets pretty messy. Like a rubber band, a really twisted magnetic field will eventually snap—and release a lot of energy in the process. That’s a solar flare. But sometimes twisted fields don’t flare, sometimes flares come from fairly innocuous-looking sunspots, and sometimes huge sunspots never do a thing.

We don't understand the physics of how solar flares occur. We have ideas (we know flares are certainly magnetic in nature), but we don't really know how they release so much energy so fast. In the absence of a definitive physical theory, the best hope for forecasting solar flares lies in scrutinizing our vast data set for observational clues.

There are two general ways to forecast solar flares: numerical models and statistical models. In the first case, we take the physics that we do know, code up the equations, run them over time, and get a forecast. In the second, we use statistics. We answer questions like: What’s the probability that an active region that’s associated with a huge sunspot will flare compared with one that’s associated with a small sunspot? As such, we build large data sets, full of features—such as the size of a sunspot, or the strength of its magnetic field—and look for relationships between these features and solar flares.
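
To make the statistical approach concrete, here is a minimal sketch of the kind of question it answers: how often do regions with large sunspots flare, compared with regions with small ones? The records, the area values, and the `LARGE_AREA` threshold are all invented for illustration; they are not SDO measurements.

```python
# Hypothetical toy records: (sunspot_area, flared).
# Every value here is invented for illustration, not real solar data.
records = [
    (900, True), (850, True), (1200, False), (1000, True),
    (150, False), (90, False), (200, True), (120, False),
]

LARGE_AREA = 500  # assumed cutoff between "huge" and "small" sunspots


def flare_probability(records, in_group):
    """Fraction of the regions selected by in_group(area) that flared."""
    outcomes = [flared for area, flared in records if in_group(area)]
    return sum(outcomes) / len(outcomes)


p_large = flare_probability(records, lambda area: area >= LARGE_AREA)
p_small = flare_probability(records, lambda area: area < LARGE_AREA)

print(f"P(flare | large spot) = {p_large:.2f}")
print(f"P(flare | small spot) = {p_small:.2f}")
```

A real statistical forecast would do the same counting over many more regions and many more features than sunspot size alone.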

Machine learning algorithms can help with this. We use machine learning algorithms everywhere. Biometric watches run them to predict when we should wake up. They’re better than doctors at predicting rare genetic disorders. They’ve identified paintings that have influenced artists throughout history. Scientists find machine learning algorithms so broadly useful because they can identify non-linear patterns (basically, any pattern that can’t be represented by a straight line), which is tough to do. That matters, because lots of patterns are non-linear.
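
A small sketch can show what "non-linear" means in practice. The four data points below are an abstract toy example (the classic "exclusive or" pattern), not solar data: the label is 1 only when the two features differ. The code tries a coarse grid of straight-line rules and compares the best of them with a simple non-linear rule. The grid and the function names are assumptions made for this illustration.

```python
from itertools import product

# Toy "exclusive or" data: the label is 1 when exactly one feature is high.
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def linear_accuracy(w1, w2, b):
    """Accuracy of a straight-line rule: predict 1 when w1*x + w2*y + b > 0."""
    hits = sum(int(w1 * x + w2 * y + b > 0) == label for (x, y), label in points)
    return hits / len(points)


# Try many straight lines on a coarse grid of weights from -3 to 3.
grid = [i / 2 for i in range(-6, 7)]
best_linear = max(linear_accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3))


def nonlinear_rule(x, y):
    """A non-linear rule: predict 1 when the two features differ."""
    return int(x != y)


nonlinear_accuracy = sum(
    nonlinear_rule(x, y) == label for (x, y), label in points
) / len(points)

print(f"best straight-line rule: {best_linear:.2f}")
print(f"non-linear rule:         {nonlinear_accuracy:.2f}")
```

No straight line gets all four points right, while the non-linear rule does. Algorithms that can discover rules of the second kind on their own are what make machine learning attractive here.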

We’ve used machine learning algorithms to forecast solar flares using SDO’s vast data set. To do this, we first built a database of all the active regions SDO has ever observed. Since it’s historical data, we already know if these active regions flared or not. The learning algorithm then analyzes active region features—such as the size of a sunspot, the strength of its associated magnetic field and the twistedness of these field lines—to identify general characteristics of flaring active regions.

To do this, the algorithm starts by making a guess. Let’s say its first guess is that a tiny sunspot with a weak magnetic field will produce a huge flare. Then it checks the answer. Whoops, nope. The algorithm then tweaks the way that it guesses. The next time around, it’ll make a different guess. Through trial and error—in the form of hundreds of thousands of guesses and checks—the algorithm figures out which features correspond to flaring active regions. Now, we have a self-taught algorithm that we can apply to real-time data.
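
The guess, check, and tweak loop described above can be sketched with a perceptron, one of the simplest learning algorithms. This is an illustrative stand-in, not necessarily the algorithm used in the work described here, and the training data (two features scaled to the range 0 to 1, plus a flared or not label) are invented for the example.

```python
# Invented toy data: (sunspot_size, field_strength), each scaled to 0..1,
# followed by 1 if the region flared and 0 if it did not.
training = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1), ((0.95, 0.6), 1),
    ((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.3), 0), ((0.15, 0.4), 0),
]


def predict(weights, bias, features):
    """Guess: 1 (flare) if the weighted sum of features is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(data, epochs=100, learning_rate=0.1):
    """Perceptron training: guess, check the answer, tweak on every mistake."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)  # check the guess
            if error != 0:
                mistakes += 1
                # Tweak: nudge the weights toward the correct answer.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # a full pass with no wrong guesses: stop
            break
    return weights, bias


weights, bias = train(training)

# Apply the trained rule to a region it has not seen before.
print(predict(weights, bias, (0.85, 0.9)))
```

The real problem involves far more examples and features, and more capable algorithms, but the loop has the same shape: every wrong guess changes how the next guess is made.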

Expanding such efforts could help us provide better notice of impending solar flares. So far, studies have found that machine-learning algorithms forecast flares better than or, at the worst, just as well as the numerical or statistical methods. This is kind of a phenomenal result in and of itself. These algorithms, which run without any human input whatsoever by simply looking for patterns in the data, and which are so general that you can use the same algorithm (on a different data set) to identify genetic disorders, can perform just as well as any other method used thus far to forecast solar flares.

And if we have more data? Who knows. Although we already have tons of data—SDO has been running for four and a half years—there haven’t been a ton of flares during that time. That’s because we’re in the quietest solar cycle of the century. That’s more reason to continue collecting data and keep the algorithms busy.
