
Solar Active Regions Emergence Prediction Using Long Short-Term Memory Networks

Spiridon Kasapis, Irina N. Kitiashvili, Alexander G. Kosovichev, John T. Stefan

Abstract

We developed Long Short-Term Memory (LSTM) models to predict the formation of active regions (ARs) on the solar surface. Using Doppler shift velocity, continuum intensity, and magnetic field observations from the Solar Dynamics Observatory (SDO) Helioseismic and Magnetic Imager (HMI), we created time-series datasets of acoustic power and magnetic flux, which are used to train LSTM models to predict continuum intensity 12 hours in advance. These novel machine learning (ML) models are able to capture variations of the acoustic power density associated with upcoming magnetic flux emergence and continuum intensity decrease. The models' performance was tested on data for 5 ARs unseen by the models during training. Model 8, the best-performing model trained, successfully predicted emergence for all testing active regions in an experimental setting and for three of them in an operational setting. The model predicted the emergence of AR11726, AR13165, and AR13179 10, 29, and 5 hours in advance, respectively, and variations of this model achieved average RMSE values of 0.11 for both active and quiet areas on the solar disc. This work sets the foundations for ML-aided prediction of solar ARs.

Paper Structure

This paper contains 8 sections, 2 equations, 9 figures, 5 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Diagram of the research data processing sequence: 61 ARs were tracked in the SDO/HMI full-disk $V_D$, $\Phi_m$ and $I_c$ maps. The tracked regions were split into smaller tiles, and the timeline datasets were created by averaging the values within each tile. The timelines are used as inputs during the training and validation/testing of different LSTM models, such as Model 8 (M8), shown in red. The architecture of Model 8 is presented in the area enclosed by the red dotted line.
  • Figure 2: RMSE obtained during the testing of Model 8 variations that differ in the number of inputs (panel a), layers (b), units (c), and epochs (d), the learning rate (e), and the number of outputs (f). The green lines represent the mean RMSE for the tiles in which emergence has been observed, whereas the red lines represent the mean RMSE for the non-emerging tiles. The red and green shaded bands represent the standard deviation. All models were evaluated on the standard 12-hour prediction problem using 96 hours' worth of input data.
  • Figure 3: Evaluation of Model 8 on predicting variations of the $I_c$ for selected tiles of AR13179 (tiles 38-44). The tiles' locations are marked with red squares in the continuum intensity images at the bottom right. Each tile's observed and predicted mean continuum intensity variations are shown as orange and blue curves. The time derivatives of the continuum intensity (observed and predicted) are color-coded according to the criterion in Equation \ref{eq:criterion}: emergence periods are shown in red and non-emerging (quiet) states in green. Tiles 40-42 are 'active' because a decrease in the continuum intensity is observed, and tile 41 exhibits the first signatures of AR13179 emergence. The rest of the tiles are quiet and correspond to non-emerging states. Vertical dashed lines mark the following moments: NOAA's first record of the active region (magenta), the time two days after NOAA's first record (purple), the time when Model 8 produces its first emergence alarm (First Warning, black), and the time when the observed emergence starts (Emergence Start, red). In the tiles where these events took place, the First Warning and Emergence Start lines are solid rather than dashed.
  • Figure 4: Comparison of the continuum intensity predictions obtained using Models 1, 2, and 8 (upper plots), the corresponding intensity derivatives ($dPred/dt$) color-coded according to the criterion in Equation \ref{eq:criterion} (middle plots), and the variations of mean acoustic power and unsigned magnetic flux (bottom plots) for two ILAP cases near AR11698 (left) and AR13179 (right). The selected tiles are marked with red squares on the continuum intensity images and 2-3 mHz power maps at the bottom of the figure, which correspond to about two days after emergence.
  • Figure 5: Evaluation of Model 8 on predicting variations of the $I_c$ for selected tiles of AR11698 (tiles 47-53). The tiles' locations are marked with red squares in the continuum intensity images at the bottom right. Each tile's observed and predicted mean continuum intensity variations are shown as orange and blue curves. The time derivatives of the continuum intensity (observed and predicted) are color-coded according to the criterion in Equation \ref{eq:criterion}: emergence periods are shown in red and non-emerging (quiet) states in green. Tiles 49-52 are 'active' because a decrease in the continuum intensity is observed, and tile 49 exhibits the first signatures of AR11698 emergence. The rest of the tiles are quiet and correspond to non-emerging states. Vertical dashed lines mark the following moments: NOAA's first record of the active region (magenta), the time two days after NOAA's first record (purple), the time when Model 8 produces its first emergence alarm (First Warning, black), and the time when the observed emergence starts (Emergence Start, red). In the tiles where these events took place, the First Warning and Emergence Start lines are solid rather than dashed.
  • ...and 4 more figures
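
The data flow described in the Figure 1 and Figure 2 captions (tile averaging, fixed-length input windows with a prediction horizon, and RMSE scoring) can be sketched roughly as follows. This is a minimal illustration with NumPy only, not the authors' code: the function names, the tile size, the sampling cadence, and the synthetic arrays are all assumptions made for the example, and the LSTM itself is omitted.

```python
import numpy as np


def tile_timelines(cube, tile):
    """Average each (tile x tile) patch of a (time, y, x) cube.

    Returns an array of shape (time, n_tiles). Pixels that do not fill a
    whole tile at the right/bottom edges are dropped (an assumption).
    """
    t, ny, nx = cube.shape
    ty, tx = ny // tile, nx // tile
    trimmed = cube[:, :ty * tile, :tx * tile]
    patches = trimmed.reshape(t, ty, tile, tx, tile)
    return patches.mean(axis=(2, 4)).reshape(t, ty * tx)


def make_windows(inputs, target, n_in, n_ahead):
    """Pair each n_in-sample input window with the target value that lies
    n_ahead samples after the end of that window.

    inputs: (time, features), target: (time,)
    Returns x of shape (n, n_in, features) and y of shape (n,).
    """
    n = len(target) - n_in - n_ahead + 1
    if n <= 0:
        raise ValueError("series too short for the requested window/horizon")
    x = np.stack([inputs[i:i + n_in] for i in range(n)])
    first = n_in + n_ahead - 1
    y = target[first:first + n]
    return x, y


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for tracked maps; shapes are hypothetical.
    power_cube = rng.random((200, 30, 30))      # acoustic power proxy
    flux_cube = rng.random((200, 30, 30))       # unsigned magnetic flux proxy
    intensity_cube = rng.random((200, 30, 30))  # continuum intensity proxy

    tile = 10  # hypothetical tile size in pixels
    power = tile_timelines(power_cube, tile)
    flux = tile_timelines(flux_cube, tile)
    intensity = tile_timelines(intensity_cube, tile)

    # One sample per hour is assumed here, so 96 samples of input and a
    # 12-sample horizon mirror the 96-hour / 12-hour setup in the captions.
    k = 0  # index of the tile to inspect
    features = np.stack([power[:, k], flux[:, k]], axis=1)
    x, y = make_windows(features, intensity[:, k], n_in=96, n_ahead=12)
    print(x.shape, y.shape)

    # Persistence baseline (not a model from the paper): predict the last
    # observed intensity in each window and score it with RMSE.
    last_seen = intensity[95:95 + len(y), k]
    print(rmse(last_seen, y))
```

Arrays shaped like `x` (samples, time steps, features) are the usual input layout for recurrent layers in common deep learning libraries, which is why the windows are stacked in that order here.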