Unveiling the Potential of Deep Learning Models for Solar Flare Prediction in Near-Limb Regions

Chetraj Pandey, Rafal A. Angryk, Berkay Aydin

TL;DR

This work tackles solar flare prediction in near-limb regions using full-disk LoS magnetograms and a 24-hour prediction window. It compares three transfer-learned CNN architectures (AlexNet, VGG16, and ResNet34) on a large, imbalanced dataset from SDO/HMI and performs a detailed spatial recall analysis across central and near-limb zones using 4-fold cross-validation. AlexNet offers the strongest overall skill (average TSS of about 0.53 and HSS of about 0.37), while ResNet34 achieves the highest near-limb recall (about 0.59 on average), showing that the models retain predictive skill in limb areas despite projection distortions. The results support operational forecasting by demonstrating near-limb predictive capability and point to future work in multi-modal and temporally aware modeling of solar activity.

Abstract

This study evaluates the performance of deep learning models in predicting $\geq$M-class solar flares with a prediction window of 24 hours, using hourly sampled full-disk line-of-sight (LoS) magnetogram images, with a particular focus on the often overlooked flare events in near-limb regions (beyond $\pm$70$^{\circ}$ of the solar disk). We trained three well-known deep learning architectures (AlexNet, VGG16, and ResNet34) using transfer learning. We evaluated and compared their overall performance using the true skill statistic (TSS) and the Heidke skill score (HSS), and computed recall scores to understand the prediction sensitivity in central and near-limb regions for both X- and M-class flares. The key findings of our study are as follows: (1) The AlexNet-based model achieved the highest overall performance, with an average TSS$\sim$0.53 and HSS$\sim$0.37. (2) A spatial analysis of recall scores showed that, for near-limb events, the VGG16- and ResNet34-based models exhibited superior prediction sensitivity. The best near-limb results were obtained with the ResNet34-based model, whose average recall was approximately 0.59 (0.81 for X-class and 0.56 for M-class flares). (3) Our models are capable of discerning complex spatial patterns from full-disk magnetograms and exhibit skill in predicting solar flares even in near-limb regions. This ability is of substantial importance for operational flare forecasting systems.
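
The abstract reports TSS, HSS, and recall without showing how they are computed. As a minimal sketch, assuming the standard confusion-matrix definitions commonly used in flare forecasting (the paper's own equations are not reproduced in this excerpt, and the function names here are illustrative), the three scores could be computed as follows, with FL as the positive class:

```python
def recall(tp, fn):
    """Fraction of actual FL instances that were predicted as FL."""
    return tp / (tp + fn)


def tss(tp, fp, tn, fn):
    """True skill statistic: recall minus false-alarm rate.

    Ranges from -1 to 1 and does not depend on the class-imbalance ratio.
    """
    return tp / (tp + fn) - fp / (fp + tn)


def hss(tp, fp, tn, fn):
    """Heidke skill score: improvement over a random-chance forecast.

    This is the two-class form often written as HSS2 (assumed here).
    """
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


# Made-up counts for illustration only (not results from the paper).
counts = dict(tp=50, fp=100, tn=900, fn=50)
print(f"recall={recall(50, 50):.2f}  TSS={tss(**counts):.2f}  HSS={hss(**counts):.2f}")
```

On an imbalanced dataset, HSS is typically lower than TSS for the same confusion matrix because false alarms weigh more heavily against a rare positive class, which is consistent with the gap between the reported TSS and HSS values.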

Paper Structure

This paper contains 7 sections, 3 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: An annotated full-disk line-of-sight magnetogram observed on 2013-01-09 at 00:00:00 UTC as an example, showing the approximate central location (within $\pm$70$^{\circ}$) and near-limb (beyond $\pm$70$^{\circ}$ to $\pm$90$^{\circ}$) region with all the NOAA active regions (ARs) present at the noted timestamp. ARs in central and near-limb regions are indicated by blue and red flags respectively. Note that the directions East (E) and West (W) are reversed in solar coordinates.
  • Figure 2: An illustration of the data labeling process for hourly observations of full-disk LoS magnetogram images with a prediction window of 24 hours. Here, 'FL' and 'NF' indicate the 'Flare' and 'No Flare' classes. The gray-filled circles indicate hourly spaced timestamps for magnetogram instances.
  • Figure 3: Data distribution of the four tri-monthly partitions for predicting $\geq$M1.0-class flares. Note that the lengths of the bars are on a logarithmic scale.
  • Figure 4: A heatmap illustrating all three models' recall performance for $\geq$M-class flares, i.e., the FL class. The locations of the flares (with maximum peak X-ray flux, used as labels) are aggregated into $5^{\circ} \times 5^{\circ}$ spatial bins of latitude and longitude. Note: White cells in the grid represent unavailable instances.
  • Figure 5: Individual heatmaps illustrating all three models' recall performance for the two subclasses of FL: (a) X-class flares and (b) M-class flares. The spatially aggregated recall scores in Fig. 4 are shown separately for the two subclasses. White cells in the grid represent unavailable instances.
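
The captions of Figures 1, 2, and 4 together describe a labeling and spatial-aggregation scheme: each hourly magnetogram is labeled FL if a $\geq$M1.0 flare occurs within the next 24 hours, the strongest flare in the window supplies the location, and locations are grouped into central or near-limb zones and into 5° bins. The following is a rough sketch of that scheme, not the authors' code. The flare-record fields, the use of peak time to decide window membership, the half-open window, and the use of longitude for the ±70° cutoff are all assumptions made for illustration.

```python
import math
from datetime import datetime, timedelta

# GOES class letter -> base peak X-ray flux in W/m^2.
GOES_SCALE = {"A": 1e-8, "B": 1e-7, "C": 1e-6, "M": 1e-5, "X": 1e-4}


def goes_flux(goes_class):
    """Convert a GOES class string such as 'M1.2' to peak flux in W/m^2."""
    return GOES_SCALE[goes_class[0].upper()] * float(goes_class[1:])


def label_instance(obs_time, flares, window_hours=24, threshold="M1.0"):
    """Label one magnetogram timestamp as 'FL' or 'NF'.

    `flares` is a hypothetical list of dicts with keys 'peak_time',
    'goes_class', 'lat', and 'lon'. Window membership is decided by
    peak time in (obs_time, obs_time + window], which is an assumption.
    Returns the label and the strongest flare in the window (or None).
    """
    window_end = obs_time + timedelta(hours=window_hours)
    in_window = [f for f in flares if obs_time < f["peak_time"] <= window_end]
    if not in_window:
        return "NF", None
    strongest = max(in_window, key=lambda f: goes_flux(f["goes_class"]))
    if goes_flux(strongest["goes_class"]) >= goes_flux(threshold):
        return "FL", strongest
    return "NF", None


def zone(lon_deg, limb_cutoff=70.0):
    """Central (within +/-70 degrees) or near-limb (beyond +/-70 degrees)."""
    return "near-limb" if abs(lon_deg) > limb_cutoff else "central"


def spatial_bin(lat_deg, lon_deg, size=5.0):
    """Lower-left corner of the size x size degree bin containing a location."""
    return (math.floor(lat_deg / size) * size, math.floor(lon_deg / size) * size)


# Made-up flare records for illustration only.
flares = [
    {"peak_time": datetime(2013, 1, 9, 12), "goes_class": "C5.0", "lat": 10.0, "lon": 20.0},
    {"peak_time": datetime(2013, 1, 9, 20), "goes_class": "M1.2", "lat": -12.0, "lon": 78.0},
    {"peak_time": datetime(2013, 1, 10, 6), "goes_class": "X1.0", "lat": 5.0, "lon": -30.0},
]
label, flare = label_instance(datetime(2013, 1, 9, 0), flares)
print(label, zone(flare["lon"]), spatial_bin(flare["lat"], flare["lon"]))
```

Per-bin recall would then be the number of correctly predicted FL instances in a bin divided by the total FL instances in that bin; bins with no FL instances correspond to the white cells in the heatmaps.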