Towards Hybrid Embedded Feature Selection and Classification Approach with Slim-TSF

Anli Ji, Chetraj Pandey, Berkay Aydin

TL;DR

This research aims to uncover hidden relationships in, and the evolutionary characteristics of, solar flares and their source regions. Its results suggest that a systematic evaluation and feature selection approach can significantly advance the predictive accuracy of solar flare forecasting models.

Abstract

Traditional solar flare forecasting approaches have mostly relied on physics-based or data-driven models using solar magnetograms, treating flare prediction as a point-in-time classification problem. This formulation has limitations, particularly in capturing the evolving nature of solar activity. To address them, our research aims to uncover hidden relationships and the evolutionary characteristics of solar flares and their source regions. Our previously proposed Sliding Window Multivariate Time Series Forest (Slim-TSF) demonstrated its feasibility when applied to multivariate time series data. A significant aspect of this study is the comparative analysis of our updated Slim-TSF framework against the outcomes of the original model. Preliminary findings indicate a notable improvement, with an average increase of 5% in both the True Skill Statistic (TSS) and the Heidke Skill Score (HSS). This enhancement not only underscores the effectiveness of our refined methodology but also suggests that our systematic evaluation and feature selection approach can significantly advance the predictive accuracy of solar flare forecasting models.
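
For readers unfamiliar with the two skill scores named above, the following is a minimal sketch of how TSS and HSS are commonly computed from a binary confusion matrix. It assumes the standard definitions used in the flare forecasting literature (the HSS variant normalized against random chance); the paper's own equations are not reproduced on this page, so the function names, the example counts, and the exact HSS variant are assumptions.

```python
def tss(tp, fn, fp, tn):
    """True Skill Statistic: recall minus false alarm rate (range -1 to 1)."""
    return tp / (tp + fn) - fp / (fp + tn)


def hss(tp, fn, fp, tn):
    """Heidke Skill Score: improvement over a random-chance forecast.

    Assumed variant: 2(TP*TN - FN*FP) / ((TP+FN)(FN+TN) + (TP+FP)(FP+TN)).
    """
    numerator = 2 * (tp * tn - fn * fp)
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return numerator / denominator


# Hypothetical confusion-matrix counts, for illustration only.
example_tss = tss(tp=40, fn=10, fp=20, tn=130)
example_hss = hss(tp=40, fn=10, fp=20, tn=130)
```

Both functions assume that each class occurs at least once, so that no denominator is zero. TSS is insensitive to the class imbalance ratio, while HSS is not, which is one reason the two are usually reported together.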

Paper Structure (13 sections, 4 equations, 3 figures)

This paper contains 13 sections, 4 equations, 3 figures.

Figures (3)

  • Figure 1: An illustration of sliding window-based statistical feature generation. We first generate subsequences (intervals) with a fixed-size sliding window and a fixed step size. We then create vectorized features from these intervals. These features serve as input to the sliding window multivariate time series forest (a random forest built on multivariate time series features), and the features are ranked by their aggregated relevance scores.
  • Figure 2: Error-bar representation of the Slim-TSF evaluation with ex-ante bootstrapping feature selection using different class weight (cw) ratios. The most relevant features are selected for each trained model across the different class weights using the log-scale filter. The TSS and HSS scores are shown for each bootstrapping experiment.
  • Figure 3: A bar plot of the feature participation ratio across three bootstrap evaluation counts. All features from the sliding window intervals and all transformed features are used.
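
The interval-based feature generation described in the Figure 1 caption can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the choice of mean, standard deviation, and slope as the per-interval statistics, the feature naming scheme, and the parameter names `param_a` and `param_b` are all assumptions made for the example.

```python
from statistics import mean, pstdev


def slope(values):
    """Least-squares slope of the values against their index 0..n-1."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den if den else 0.0


def sliding_window_features(series, window, step):
    """Vectorize a multivariate time series into interval statistics.

    series: dict mapping a parameter name to a list of values.
    Returns a dict mapping a feature name to its value, with one
    mean/std/slope triple per parameter and per sliding-window interval.
    """
    features = {}
    for name, values in series.items():
        # Slide a fixed-size window over the series with a fixed step.
        for start in range(0, len(values) - window + 1, step):
            interval = values[start:start + window]
            key = f"{name}[{start}:{start + window}]"
            features[f"{key}_mean"] = mean(interval)
            features[f"{key}_std"] = pstdev(interval)
            features[f"{key}_slope"] = slope(interval)
    return features


# Hypothetical two-parameter series, for illustration only.
example = {
    "param_a": [1, 2, 3, 4, 5, 6],
    "param_b": [2, 2, 2, 2, 2, 2],
}
feats = sliding_window_features(example, window=4, step=2)
```

With a window of 4 and a step of 2, each six-point series yields two intervals, so the example produces 12 features in total. A flat feature vector of this kind is what a random forest could then consume and rank by feature relevance.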