
Enriched Functional Tree-Based Classifiers: A Novel Approach Leveraging Derivatives and Geometric Features

Fabrizio Maturo, Annamaria Porreca

TL;DR

The paper tackles high-dimensional scalar-on-function classification by integrating Functional Data Analysis (FDA) with tree-based ensembles through Enriched Functional Tree-Based Classifiers (EFTCs). EFTCs represent each functional observation with a feature vector $\mathbf{S}_i$, built on a fixed B-spline basis, that combines the original curve with derivative and geometric features ($c_{is}, c_{is}^{(1)}, c_{is}^{(2)}, d_{is}, e_{is}, \epsilon_{is}$), so that trees can split across multiple transformed views of the same curve. Ensembles such as Enriched Functional Random Forests (EFRF), Enriched Functional XGBoost (EFXGB), and Enriched Functional LightGBM (EFLGBM) consistently outperform their non-enriched baselines on seven real datasets and six simulated scenarios, with improved accuracy and reduced variance. The work also addresses interpretability and explainability: it proposes conditional feature-importance strategies to handle correlations among the enriched features and outlines directions for extending the fixed-B-spline approach to other bases and classifiers.

Abstract

This research belongs to the scalar-on-function classification literature, a field of significant interest in statistics, mathematics, and computer science. The study introduces a methodology for supervised classification of high-dimensional time series that integrates Functional Data Analysis (FDA) with tree-based ensemble techniques. The proposed framework, Enriched Functional Tree-Based Classifiers (EFTCs), leverages derivative and geometric features and benefits from the diversity inherent in ensemble methods to enhance predictive performance and reduce variance. Although our approach has been tested on the enrichment of Functional Classification Trees (FCTs), Functional K-NN (FKNN), Functional Random Forest (FRF), Functional XGBoost (FXGB), and Functional LightGBM (FLGBM), it could be extended to other tree-based and non-tree-based classifiers, subject to the considerations that emerge from this investigation. In extensive experimental evaluations on seven real-world datasets and six simulated scenarios, the proposal shows consistent improvements over traditional approaches and provides new insights into the application of FDA to complex, high-dimensional learning problems.
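
The enrichment idea summarized above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes NumPy, SciPy, and scikit-learn are available. The helper name `enriched_features`, the number of basis functions, the simulated curves, and the choice of views (curve, first derivative, second derivative, curvature) are all illustrative assumptions; the paper's other geometric measures are omitted. Each view is projected onto the same fixed cubic B-spline basis, and the stacked coefficients are passed to a random forest.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline
from sklearn.ensemble import RandomForestClassifier


def enriched_features(t, curves, n_basis=10, degree=3):
    """Hypothetical helper: stack B-spline coefficients of several views per curve."""
    # One fixed, clamped knot vector shared by all curves and all views.
    interior = np.linspace(t[0], t[-1], n_basis - degree + 1)[1:-1]
    knots = np.concatenate([
        np.repeat(t[0], degree + 1), interior, np.repeat(t[-1], degree + 1)
    ])

    rows = []
    for y in curves:
        spline = make_lsq_spline(t, y, knots, k=degree)
        d1 = spline.derivative(1)(t)   # first derivative on the grid
        d2 = spline.derivative(2)(t)   # second derivative on the grid
        curvature = np.abs(d2) / (1.0 + d1**2) ** 1.5
        # Project every view onto the same basis and keep its coefficients.
        views = [y, d1, d2, curvature]
        coefs = [make_lsq_spline(t, v, knots, k=degree).c for v in views]
        rows.append(np.concatenate(coefs))
    return np.vstack(rows)


# Simulated two-class example (assumed data, for illustration only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)


def simulate(n, label):
    base = np.sin(2 * np.pi * t)
    extra = 0.5 * np.sin(6 * np.pi * t) if label == 1 else 0.0
    return base + extra + rng.normal(scale=0.1, size=(n, t.size))


curves = np.vstack([simulate(60, 0), simulate(60, 1)])
labels = np.repeat([0, 1], 60)
order = rng.permutation(labels.size)
curves, labels = curves[order], labels[order]

X = enriched_features(t, curves)           # 4 views x 10 coefficients each
X_train, X_test = X[:80], X[80:]
y_train, y_test = labels[:80], labels[80:]

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
accuracy = forest.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Because every view shares one basis, each curve maps to a vector of the same length, which is what lets a standard tree ensemble split on coefficients from any view. The forest could be swapped for a gradient-boosting classifier without changing the feature construction.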

Paper Structure

This paper contains 21 sections, 25 equations, 8 figures, 2 tables, 1 algorithm.

Figures (8)

  • Figure 1: Curvature and radius of curvature and their geometrical interpretation.
  • Figure 2: Functional Data for the Training Set (Car Dataset).
  • Figure 3: Functional Data for the Test Set (Car Dataset).
  • Figure 4: Comparison of Classifier Performance on the Car Dataset. The accuracy is compared across original curves, enriched features, and classical FDA methods.
  • Figure 5: Original curve representations of the six datasets.
  • ...and 3 more figures