Solar Transient Recognition Using Deep Learning (STRUDL) for heliospheric imager data

Maike Bauer, Justin Le Louëdec, Tanja Amerstorfer, Luke Barnard, David Barnes, Helmut Lammer

TL;DR

This work addresses automated detection and tracking of coronal mass ejections (CMEs) in heliospheric imager (HI) data using STRUDL, a pipeline that combines a 3D U-Net-based segmentation model with a post-processing tracking algorithm. By processing temporal sequences of HI1 images and linking detections across frames, STRUDL generates time-distance CME tracks, which are evaluated against ground-truth catalogs (HICAT/HIGeoCAT) and HELCATS data. Segmentation proves feasible, with an IoU of about 0.35 and a Dice score of about 0.52. Event-based tracking achieves a precision of 0.87 but a recall of only 0.56, with start- and end-time errors of about 1.42 h and 4.71 h, respectively. Continuous tracking is more challenging during solar maximum because of complexity and catalog limitations. The study highlights the promise of ML-based CME detection while identifying areas for improvement, such as annotation strategies for full CME structures, advanced tracking methods, and adapting the approach for real-time beacon data in future missions.
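The segmentation metrics quoted above (IoU, Dice, precision, recall) can be illustrated with a minimal sketch. The function name `segmentation_metrics`, the toy arrays, and the default threshold are assumptions made for illustration and are not taken from the STRUDL code; the sketch only shows the standard pixel-wise definitions applied to a probability map that is binarized at a chosen threshold.

```python
import numpy as np


def segmentation_metrics(prob, truth, threshold=0.45):
    """Binarize a probability map and score it against a binary ground-truth mask.

    Hypothetical helper: the name and the default threshold are illustrative.
    """
    pred = np.asarray(prob) >= threshold
    truth = np.asarray(truth).astype(bool)

    tp = int(np.sum(pred & truth))    # front pixels found by the model
    fp = int(np.sum(pred & ~truth))   # predicted front pixels not in the mask
    fn = int(np.sum(~pred & truth))   # mask pixels the model missed

    def ratio(num, den):
        # Return NaN instead of dividing by zero when a metric is undefined.
        return num / den if den else float("nan")

    return {
        "iou": ratio(tp, tp + fp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Toy 4x4 example (made-up numbers, not HI data).
truth = np.array([[0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
prob = np.array([[0.1, 0.9, 0.6, 0.7],
                 [0.2, 0.3, 0.8, 0.5],
                 [0.0, 0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0, 0.0]])
metrics = segmentation_metrics(prob, truth)
```

For a single prediction, the two overlap scores are tied by Dice = 2 IoU / (1 + IoU). The quoted values of about 0.35 and 0.52 are consistent with that relation, although scores averaged over many images need not satisfy it exactly.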

Abstract

Coronal Mass Ejections (CMEs) are space weather phenomena capable of causing significant disruptions to both space- and ground-based infrastructure. The timely and accurate detection and prediction of CMEs is a crucial step towards implementing strategies to minimize the impacts of such events. CMEs are commonly observed using coronagraphs and heliospheric imagers (HIs), with some forecasting methods relying on manually tracking CMEs across successive images in order to provide an estimate of their arrival time and speed. This process is time-consuming, and results may exhibit considerable interpersonal variation. We investigate the application of machine learning (ML) techniques to the problem of automated CME detection, focusing on data from the HI instruments aboard the STEREO spacecraft. HI data facilitate the tracking of CMEs through interplanetary space, providing valuable information on their evolution. Building on advances in image segmentation, we present the Solar Transient Recognition Using Deep Learning (STRUDL) model. STRUDL is designed to automatically detect and segment CME fronts in HI data. We address the challenges inherent to this task and evaluate the model's performance across a range of solar activity conditions. To complement segmentation, we implement a basic tracking algorithm that links CME detections across successive frames, thus allowing us to automatically generate time-distance profiles. Our results demonstrate the feasibility of applying ML-based segmentation techniques to HI data, while highlighting areas for future improvement, particularly regarding the accurate segmentation and tracking of faint and interacting CMEs.
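The abstract does not spell out how detections are linked across frames, so the following is only a hedged sketch of one simple possibility: greedy linking by mask overlap between consecutive frames. The functions `link_detections` and `time_distance`, the `min_iou` parameter, and the use of the outermost pixel column as a stand-in for distance are all assumptions made for illustration, not the STRUDL tracking algorithm.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


def link_detections(frames, min_iou=0.1):
    """Greedily link per-frame detections into tracks (hypothetical scheme).

    frames: list with one entry per time step; each entry is a list of
            boolean masks, one per detected front.
    Returns a list of tracks, each a list of (frame_index, mask) pairs.
    """
    tracks = []
    for t, detections in enumerate(frames):
        for mask in detections:
            best, best_score = None, min_iou
            for k, track in enumerate(tracks):
                last_t, last_mask = track[-1]
                if last_t != t - 1:
                    continue  # only extend tracks that ended in the previous frame
                score = iou(mask, last_mask)
                if score >= best_score:
                    best, best_score = k, score
            if best is None:
                tracks.append([(t, mask)])      # no overlap: start a new track
            else:
                tracks[best].append((t, mask))  # extend the best-overlapping track
    return tracks


def time_distance(track):
    """Per-frame outermost pixel column, used here as a proxy for distance."""
    return [(t, int(np.nonzero(mask.any(axis=0))[0].max())) for t, mask in track]


def box(r0, r1, c0, c1, shape=(8, 8)):
    """Toy rectangular 'front' mask."""
    m = np.zeros(shape, dtype=bool)
    m[r0:r1, c0:c1] = True
    return m


# One front drifting to the right over three frames, plus an unrelated
# detection that appears in the last frame.
frames = [
    [box(2, 5, 0, 3)],
    [box(2, 5, 1, 4)],
    [box(2, 5, 2, 5), box(6, 8, 6, 8)],
]
tracks = link_detections(frames)
profile = time_distance(tracks[0])
```

In this toy case the drifting front forms one three-frame track and the unrelated detection starts a second track. Overlap-based linking of this kind can be expected to struggle with faint or interacting fronts, which the abstract identifies as areas for future improvement.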

Paper Structure

This paper contains 12 sections, 5 equations, 8 figures, 4 tables, 1 algorithm.

Figures (8)

  • Figure 1: Sequences of STEREO-A HI1 images from January 2010 (A) and May 2010 (B). The data have been post-processed and subsequently reprocessed into running difference images. The manually annotated CME front is overlaid in red.
  • Figure 2: Example of post-processing steps to obtain consensus masks from the Solar Stormwatch II citizen science project, which we use as a source of additional data to train, validate, and test STRUDL. The leftmost column shows the summed annotations plotted over the corresponding STEREO/HI running difference image. The middle column shows the same annotations after application of a Gaussian filter with $\sigma = 8$, followed by a Sato filter with $\sigma \in [1,10]$. In the rightmost column, the final binary consensus masks, obtained after normalizing the image and applying a threshold $t = 0.3$, are shown.
  • Figure 3: Schematic illustration of our 3D UNet. The diagram illustrates the sequence of operations applied to the input data as it propagates through the network. The left-hand side of the network corresponds to the encoder path, the right-hand side to the decoder path. Arrows indicate the direction of the flow of the data. Some arrows are marked with a square, showing the presence of a dropout layer. Light orange blocks represent convolutions, while darker orange bands represent the ReLU activation function. The numbers below the convolutions indicate the size of the feature space after the convolution operation. Red and blue blocks indicate 3D max pooling and unpooling operations, respectively. Purple blocks indicate transpose convolutions. The terminal layer (green) normalizes the network's output between 0 and 1 using a sigmoid function. The purple arrows between the encoder and decoder paths of the network show the skip connections.
  • Figure 4: Key metrics for segmentation performance achieved using different aggregation methods (colored), averaged for all 5 models. The leftmost panel shows precision plotted against recall. The middle panel and rightmost panel display the IoU and Dice Score, respectively, at various binarization thresholds ranging from 0.05 to 0.95.
  • Figure 5: Sequences of images showcasing the model's segmentation performance using the best method-threshold combination threshold(mean) = 0.45. The overlap between the predicted front and the ground truth mask is shown. Each pixel belonging to the front is classified as a TP (purple), FP (pink), or FN (orange). Panel (A) shows a single CME front close to the Sun, while panel (B) shows two CME fronts, one being further away from the Sun and thus fainter than the other one.
  • ...and 3 more figures
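
The consensus-mask steps described in the Figure 2 caption (Gaussian filter with $\sigma = 8$, Sato filter with $\sigma \in [1,10]$, normalization, threshold $t = 0.3$) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `consensus_mask`, the choice of integer Sato scales 1 to 10, the bright-ridge setting, and min-max normalization are assumptions. It relies on `scipy.ndimage.gaussian_filter` and `skimage.filters.sato`.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.filters import sato


def consensus_mask(summed, gauss_sigma=8, sato_sigmas=range(1, 11), threshold=0.3):
    """Turn summed annotations (2D array of counts) into a binary consensus mask.

    Hypothetical helper following the steps named in the Figure 2 caption.
    """
    # Step 1: smooth the summed annotations (Gaussian filter, sigma = 8).
    smoothed = gaussian_filter(summed.astype(float), sigma=gauss_sigma)

    # Step 2: enhance ridge-like structures with a Sato filter.
    # Assumption: integer scales 1..10, and annotations are bright ridges
    # on a dark background (black_ridges=False).
    ridges = sato(smoothed, sigmas=sato_sigmas, black_ridges=False)

    # Step 3: normalize to [0, 1]. Assumption: min-max normalization.
    span = ridges.max() - ridges.min()
    if span == 0:
        return np.zeros(summed.shape, dtype=bool)
    normalized = (ridges - ridges.min()) / span

    # Step 4: binarize with the threshold t = 0.3.
    return normalized > threshold


# Toy input: five annotators trace nearly the same horizontal front.
summed = np.zeros((128, 128), dtype=int)
for row in (62, 63, 64, 65, 66):
    summed[row, :] += 1
mask = consensus_mask(summed)
```

In this toy example the mask forms a band centered on the row where the annotations agree, while pixels far from any annotation stay empty.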