A new computationally efficient algorithm to solve Feature Selection for Functional Data Classification in high-dimensional spaces

Tobia Boschi; Francesca Bonin; Rodrigo Ordonez-Hurtado; Alessandra Pascale; Jonathan Epperlein

TL;DR

FSFC (Feature Selection for Functional Classification) is a novel methodology that jointly performs feature selection and classification of functional data in scenarios with categorical responses and multivariate longitudinal features. Its feature selection capability can also be leveraged to significantly reduce the problem's dimensionality and improve the performance of other classification algorithms.

Abstract

This paper introduces a novel methodology for Feature Selection for Functional Classification, FSFC, that addresses the challenge of jointly performing feature selection and classification of functional data in scenarios with categorical responses and multivariate longitudinal features. FSFC tackles a newly defined optimization problem that integrates logistic loss and functional features to identify the most crucial variables for classification. To solve the resulting minimization problem, we employ functional principal components and develop a new adaptive version of the Dual Augmented Lagrangian algorithm. The computational efficiency of FSFC enables handling high-dimensional scenarios where the number of features may considerably exceed the number of statistical units. Simulation experiments demonstrate that FSFC outperforms other machine learning and deep learning methods in both computational time and classification accuracy. Furthermore, the FSFC feature selection capability can be leveraged to significantly reduce the problem's dimensionality and improve the performance of other classification algorithms. The efficacy of FSFC is also demonstrated through a real data application, analyzing relationships between four chronic diseases and other health and demographic factors.
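To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes simulated curves, represents each functional feature by its top principal component scores, and fits a logistic loss with a group penalty (one group per feature). The solver is plain proximal gradient, swapped in for the paper's adaptive Dual Augmented Lagrangian algorithm. All function names, the simulation design, and the choice `lam_ratio=0.3` are illustrative assumptions, and the intercept is omitted for brevity.

```python
import numpy as np

def fpc_scores(curves, k):
    """Project centered curves (n, T) onto their top-k principal components."""
    centered = curves - curves.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def group_soft_threshold(B, t):
    """Row-wise proximal operator of t * sum_j ||B_j||_2."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    return np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12)) * B

def logistic_grad(A, y, b):
    """Gradient of mean(log(1 + exp(-y * A b))) for labels y in {-1, +1}."""
    margin = y * (A @ b)
    # 0.5 * (1 - tanh(m / 2)) equals 1 / (1 + exp(m)) without overflow
    w = 0.5 * (1.0 - np.tanh(margin / 2.0))
    return -(A.T @ (y * w)) / len(y)

def fit_group_logistic(S, y, lam_ratio=0.3, n_iter=500):
    """Proximal gradient on logistic loss + lam * group norm; S is (n, p, k)."""
    n, p, k = S.shape
    A = S.reshape(n, p * k)
    g0 = logistic_grad(A, y, np.zeros(p * k)).reshape(p, k)
    # lam is set as a fraction of the smallest lam giving an all-zero solution
    lam = lam_ratio * np.linalg.norm(g0, axis=1).max()
    # step = 1 / L, with L = ||A||_2^2 / (4 n) bounding the loss curvature
    step = 4.0 * n / np.linalg.norm(A, 2) ** 2
    B = np.zeros((p, k))
    for _ in range(n_iter):
        G = logistic_grad(A, y, B.ravel()).reshape(p, k)
        B = group_soft_threshold(B - step * G, step * lam)
    return B

# Simulated data (assumption): p functional features, only the first two matter.
rng = np.random.default_rng(0)
n, p, T, k = 200, 20, 30, 3
t = np.linspace(0.0, 1.0, T)
basis = np.stack([np.ones(T), np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
X = rng.standard_normal((n, p, 3)) @ basis            # curves, shape (n, p, T)
beta = 1.0 + np.sin(2 * np.pi * t)                    # coefficient curve
eta = (X[:, :2, :] * beta).mean(axis=2).sum(axis=1)   # approximate integral
y = np.where(eta >= 0, 1.0, -1.0)

S = np.stack([fpc_scores(X[:, j, :], k) for j in range(p)], axis=1)
B_hat = fit_group_logistic(S, y)
row_norms = np.linalg.norm(B_hat, axis=1)
top2 = set(np.argsort(row_norms)[-2:].tolist())
print("features with largest coefficient norms:", sorted(top2))
```

Rows of `B_hat` with zero norm correspond to features dropped by the penalty, so the surviving features could then be passed to another classifier, in the spirit of the dimensionality reduction described above.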

Paper Structure (27 sections, 2 theorems, 33 equations, 3 figures, 4 tables, 1 algorithm)

Introduction
Related Work.
Methods
Problem definition
Matrix representation
Selection of K.
Dual Augmented Lagrangian (DAL) Algorithm
Dual problem.
Update of Z.
Update of V.
Computational efficiency.
Convergence criteria.
Model selection and adaptive implementation
Simulation study
Settings.
...and 12 more sections

Key Result

Proposition 2.1

Considering $h$ as in Equation eq:primal, then the function $h^*$ is defined for $\left|Y_i V_i\right| < 1$ as follows:

Figures (3)

Figure 1: Simulation results. Boxplots generated from the distribution obtained across 50 replications of each scenario, with gray diamonds and horizontal lines indicating means and medians of the distributions, respectively. Selection performances (precision and recall) are computed just for FSFC, while classification accuracy in the training/test set is reported for all the examined algorithms (LSTM, SVM, r-LSTM, r-SVM, FSFC). The rows illustrate two distinct scenarios ($n=300$, $p=800$, and $n=600$, $p=2000$). In each scenario, we investigate $p_0 = 2, 5, 10, 20$ (x-axes).
Figure 2: Experiment 1 (upper panel) and Experiment 2 (lower panel) SHARE results. The test set classification accuracy boxplots (on the left) are generated from 100 replications. The dots and the horizontal lines indicate the means and medians of the distributions, respectively. On the right, features selected by FSFC for more than 80 out of 100 replications are displayed for each response variable. The bar plots illustrate the average ratio of $\lambda_{max}$ at which each feature entered the active set. The higher the ratio, the earlier the feature is included in the model during the $\lambda$ path search.
Figure 51: On the left, a sample of 50 curves from the first feature of the design matrix $\mathcal{X}$ (top) and the 10 non-zero $\mathcal{B}$ coefficients (bottom) are displayed for the given scenario with $n=300$, $p=800$, $p_0=10$. On the right, the SHARE project timeline is depicted, which has been sourced from the SHARE website: https://share-eric.eu/data/data-documentation/waves-overview

Theorems & Definitions (2)

Proposition 2.1
Proposition 2.2
