FSDR: A Novel Deep Learning-based Feature Selection Algorithm for Pseudo Time-Series Data using Discrete Relaxation

Mohammad Rahman, Manzur Murshed, Shyh Wei Teng, Manoranjan Paul

TL;DR

Experimental results demonstrate that FSDR outperforms three commonly used feature selection algorithms, taking into account a balance among execution time, $R^2$, and $RMSE$.

Abstract

Conventional feature selection algorithms often have impractical computational complexity when applied to high-dimensional Pseudo Time-Series (PTS) data, which consists of observations arranged in sequential order without adhering to a conventional temporal dimension. To address this challenge, we introduce a Deep Learning (DL)-based feature selection algorithm tailored for PTS data: Feature Selection through Discrete Relaxation (FSDR). Unlike existing feature selection algorithms, FSDR learns the important features as model parameters using discrete relaxation, that is, by approximating a discrete optimisation problem with a continuous one. FSDR can accommodate a high number of feature dimensions, a capability beyond the reach of existing DL-based or traditional methods. In tests on a hyperspectral dataset (a type of PTS data), our experimental results demonstrate that FSDR outperforms three commonly used feature selection algorithms, taking into account a balance among execution time, $R^2$, and $RMSE$.

Paper Structure (10 sections, 4 figures, 1 table)

This paper contains 10 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Model performance ($R^2$) for all combinations of two band indices from the 66 bands in the downsampled dataset. Model performance changes smoothly with small changes in the band indices.
  • Figure 2: High-level architecture of (a) LASSO and (b) FSDR, highlighting the major differences. LASSO trains the model with all the available features. FSDR instead takes a subset containing the given target number ($t$) of features, trains the model, and updates the subset with features near those currently in it, based on their impact on model performance.
  • Figure 3: Detailed FSDR Architecture. Training starts with an initial subset $F_t$ containing indices of $t$ features, which is transformed into $s_t$ through a sequence of scale-down and sigmoid transformations. Dataset $\mathcal{D}_{N \times D}$ is transformed into $N$ continuous functions $\mathcal{D'}_{N}(.)$. For each element in $s_t$, $\mathcal{D'}_{N}(.)$ is evaluated and passed to an FC network, and $f_t$ is updated through backpropagation. The final set $F_{t'}$ is obtained from $s_t$ by rescaling, rounding each element to the nearest integer, thereby transforming the elements back to the representation of actual indices, and removing duplicates.
  • Figure 4: All the discrete features (band reflectances) for each sample are transformed into a continuous function with a domain ranging from 0 to 1 using Cubic Spline Interpolation.
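
The captions for Figures 3 and 4 describe two mechanical steps that a short sketch may make more concrete: turning one sample's discrete band reflectances into a continuous function on $[0, 1]$, and mapping learned continuous positions back to a set of unique band indices. The code below is only an illustration under several assumptions that are not taken from the paper: the function names are hypothetical, the bands are treated as evenly spaced with band $i$ of $D$ placed at position $i/(D-1)$, the spline uses natural boundary conditions, and everything is written in plain Python rather than in a DL framework. The training loop, the FC network, and the exact scale-down and sigmoid parameterisation are not shown.

```python
import math


def natural_cubic_spline(values):
    """Return f(p), p in [0, 1], interpolating `values` at evenly spaced knots.

    Knot i sits at position i / (len(values) - 1). Natural boundary conditions
    (zero second derivative at both ends) are an assumption of this sketch.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    h = 1.0 / (n - 1)
    m = n - 2  # number of interior knots with unknown second derivatives

    # Tridiagonal system M[i-1] + 4*M[i] + M[i+1] = rhs, solved with the
    # Thomas algorithm (forward sweep, then back substitution).
    rhs = [6.0 * (values[i - 1] - 2.0 * values[i] + values[i + 1]) / (h * h)
           for i in range(1, n - 1)]
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    c_prime[0] = 0.25
    d_prime[0] = rhs[0] / 4.0
    for i in range(1, m):
        denom = 4.0 - c_prime[i - 1]
        c_prime[i] = 1.0 / denom
        d_prime[i] = (rhs[i] - d_prime[i - 1]) / denom

    second = [0.0] * n  # second derivatives at the knots; both ends stay 0
    second[m] = d_prime[m - 1]
    for i in range(m - 2, -1, -1):
        second[i + 1] = d_prime[i] - c_prime[i] * second[i + 2]

    def f(p):
        p = min(1.0, max(0.0, p))   # clamp to the domain [0, 1]
        i = min(int(p / h), n - 2)  # left knot of the segment containing p
        b = p / h - i               # fractional distance from the left knot
        a = 1.0 - b
        cubic = ((a ** 3 - a) * second[i]
                 + (b ** 3 - b) * second[i + 1]) * h * h / 6.0
        return a * values[i] + b * values[i + 1] + cubic

    return f


def index_to_position(index, num_bands):
    """Scale a band index down to [0, 1] (assumed mapping: index / (D - 1))."""
    return index / (num_bands - 1)


def positions_to_indices(positions, num_bands):
    """Rescale positions to index units, round to nearest, drop duplicates."""
    rounded = {int(math.floor(p * (num_bands - 1) + 0.5)) for p in positions}
    return sorted(rounded)


# Hypothetical reflectances for one sample with five bands.
reflectance = natural_cubic_spline([0.2, 0.5, 0.1, 0.9, 0.4])
between_bands = reflectance(0.3)  # a value between band 1 and band 2
selected = positions_to_indices([0.0, 0.101, 0.099, 1.0], num_bands=11)
```

Because the interpolated function is defined between band positions, a position can move by a small continuous amount during optimisation, and two positions that drift close together collapse to the same index after rounding, which is why duplicates are removed at the end.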