Detecting and Classifying Flares in High-Resolution Solar Spectra with Supervised Machine Learning

Nicole Hao, Laura Flagg, Ray Jayawardhana

TL;DR

The paper addresses solar flare contamination in high-resolution spectra and its implications for exoplanet transmission spectroscopy by building a standardized supervised-learning workflow that combines HARPS-N solar spectra with RHESSI flare labels. It preprocesses the data with normalization, noise filtering, and PCA, labels spectra into three flare categories, and assesses multiple classifiers, identifying SVC with an RBF kernel as the best performer. The model achieves a final aggregate accuracy of about $0.65$ and class-wise accuracies of $0.64$, $0.77$, and $0.56$, with notable improvements for weak flares after hyperparameter tuning. The work demonstrates robustness to new data and outlines future directions, including deep learning approaches and extending the framework to flare classification in exoplanet host stars for improved spectral correction.
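The workflow summarized above (normalization, PCA, an RBF-kernel SVC, and hyperparameter tuning) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the spectra are synthetic stand-ins, the class-dependent "bump" is invented for the demo, and the choice of 10 principal components and the parameter grid are hypothetical values picked only to keep the example small.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for normalized spectra (assumption, not HARPS-N data):
# 300 "spectra" with 200 "wavelength bins", balanced over three flare classes.
n_per_class, n_bins = 100, 200
labels = np.repeat([0, 1, 2], n_per_class)  # hypothetical: 0 = no flare, 1 = weak, 2 = strong
spectra = rng.normal(1.0, 0.05, size=(labels.size, n_bins))
# Invented class-dependent feature so the classes are separable in this demo.
spectra[:, 90:110] += 0.05 * labels[:, None]

X_train, X_test, y_train, y_test = train_test_split(
    spectra, labels, test_size=0.25, stratify=labels, random_state=0
)

# Scale each bin, reduce dimensionality with PCA, then classify with an RBF-kernel SVC.
pipeline = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))

# Small hypothetical grid over the SVC regularization strength C and kernel width gamma.
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("held-out accuracy:", round(test_accuracy, 3))
```

Wrapping the scaler and PCA inside the pipeline means they are refit on each cross-validation fold, so the tuning step never sees statistics computed from held-out spectra.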

Abstract

Flares are a well-studied aspect of the Sun's magnetic activity. Detecting and classifying solar flares can inform the analysis of contamination caused by stellar flares in exoplanet transmission spectra. In this paper, we present a standardized procedure to classify solar flares with the aid of supervised machine learning. Using flare data from the RHESSI mission and solar spectra from the HARPS-N instrument, we trained several supervised machine learning models and found that the best-performing algorithm is a C-Support Vector Machine (SVC) with a non-linear Radial Basis Function (RBF) kernel. The best-trained model, SVC with an RBF kernel, achieves an average aggregate accuracy score of 0.65 and categorical accuracy scores of over 0.70 for the no-flare and weak-flare classes. In comparison, a blind classification algorithm choosing among the three classes would have an accuracy score of 0.33. Testing showed that the model can detect and classify solar flares in entirely new data with different characteristics and distributions from those of the training set. Future efforts could focus on enhancing classification accuracy, investigating the efficacy of alternative models, particularly deep learning models, and incorporating more datasets to extend the application of this framework to stars that host exoplanets.
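To make the comparison between the aggregate score, the per-class scores, and the 0.33 blind baseline concrete, the sketch below computes all three from a confusion matrix. The matrix values are made up for illustration and are not results from the paper. The per-class score is assumed here to be the fraction of each true class that is labeled correctly, which may differ from the paper's exact definition, and the baseline assumes uniform guessing over three balanced classes.

```python
def aggregate_accuracy(confusion):
    """Fraction of all samples on the diagonal (correctly classified)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_accuracy(confusion):
    """For each true class (row), the fraction of its samples labeled correctly."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


def chance_accuracy(n_classes):
    """Expected accuracy of uniform random guessing on balanced classes."""
    return 1 / n_classes


# Hypothetical confusion matrix: rows are true classes, columns are predictions.
# These counts are invented for the example, not taken from the paper.
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [3, 3, 4],
]

overall = aggregate_accuracy(confusion)
by_class = per_class_accuracy(confusion)
baseline = chance_accuracy(len(confusion))

print("aggregate:", overall)
print("per class:", by_class)
print("blind baseline:", round(baseline, 2))
```

A model is only informative to the extent that its aggregate score exceeds the blind baseline, which is why the abstract reports the two side by side.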

Paper Structure (13 sections, 5 equations, 9 figures, 1 table)

Figures (9)

  • Figure 1: Solar spectrum, a single observation
  • Figure 2: Plot of normalized flux against wavelengths
  • Figure 3: First 10 Principal Components of Normalized Solar Spectra
  • Figure 4: Original solar flare data proportions (left); balanced solar flare data (right)
  • Figure 5:
  • ...and 4 more figures