Augmented Functional Random Forests: Classifier Construction and Unbiased Functional Principal Components Importance through Ad-Hoc Conditional Permutations

Fabrizio Maturo, Annamaria Porreca

TL;DR

This work tackles the classification of high-dimensional functional data by combining Functional Data Analysis (FDA) with tree-based methods. It introduces Augmented Functional Classification Trees (AFCTs) and Augmented Functional Random Forests (AFRFs), which use features built from successive derivatives of the curves together with functional principal components. It also introduces CPIAFPCs, an ad-hoc conditional permutation measure that gives unbiased importance for functional principal components when the derivative-based features are correlated. Empirical results on ECG200 and six simulated scenarios show that AFRFs and AFCTs often outperform baseline functional forests, and that CPIAFPCs help interpret feature contributions. The authors point to possible extensions to other bases and broader explainability tools.

Abstract

This paper introduces a novel supervised classification strategy that integrates functional data analysis (FDA) with tree-based methods, addressing the challenges of high-dimensional data and enhancing the classification performance of existing functional classifiers. Specifically, we propose augmented versions of functional classification trees and functional random forests, incorporating a new tool for assessing the importance of functional principal components. This tool provides an ad-hoc method for determining unbiased permutation feature importance in functional data, particularly when dealing with correlated features derived from successive derivatives. Our study demonstrates that these additional features can significantly enhance the predictive power of functional classifiers. Experimental evaluations on both real-world and simulated datasets showcase the effectiveness of the proposed methodology, yielding promising results compared to existing methods.
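
To make the idea of derivative-based augmented features concrete, here is a minimal sketch and not the authors' implementation. It assumes curves sampled on a common grid, approximates functional principal components by an SVD of the discretised curves (skipping the smoothing step shown in the figures), and uses finite differences for derivatives. The function names `fpc_scores` and `augmented_features` and the toy data are hypothetical.

```python
import numpy as np

def fpc_scores(curves, n_components):
    """Scores of discretised curves on their leading principal components."""
    centered = curves - curves.mean(axis=0)
    # Rows of vt are the principal directions (discretised eigenfunctions).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def augmented_features(curves, grid, n_components=3, max_order=1):
    """Stack the scores of the curves and of their successive derivatives."""
    blocks = [fpc_scores(curves, n_components)]
    current = curves
    for _ in range(max_order):
        # Finite-difference derivative along the grid (an assumption of this sketch).
        current = np.gradient(current, grid, axis=1)
        blocks.append(fpc_scores(current, n_components))
    return np.hstack(blocks)

# Toy data (hypothetical): 20 noisy sine curves observed at 50 grid points.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
amplitude = rng.normal(1.0, 0.3, size=(20, 1))
curves = amplitude * np.sin(2 * np.pi * grid) + rng.normal(0.0, 0.05, size=(20, 50))

features = augmented_features(curves, grid)
print(features.shape)  # (20, 6): 3 scores for the curves, 3 for the first derivative
```

The resulting matrix could be passed to any tree or random forest implementation. Scores within one block are uncorrelated by construction, but scores from a curve and from its derivative can be correlated, which is the situation the conditional permutation importance described in the abstract is meant to handle.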

Paper Structure (12 sections, 16 equations, 20 figures, 2 tables, 1 algorithm)

Figures (20)

  • Figure 1: ECGs in the training set.
  • Figure 2: ECGs in the test set.
  • Figure 3: Smoothed ECGs in the training set.
  • Figure 4: Smoothed ECGs in the test set.
  • Figure 5: First Ten Functional Principal Components of the original curves.
  • ...and 15 more figures