FoLDTree: A ULDA-Based Decision Tree Framework for Efficient Oblique Splits and Feature Selection
Siyu Wang, Kehui Yao
TL;DR
This work introduces LDATree and FoLDTree, two decision-tree frameworks based on Uncorrelated Linear Discriminant Analysis (ULDA) that implement efficient oblique splits, handle missing values robustly, and support feature selection. By embedding ULDA and Forward ULDA within a recursive tree structure, the methods produce multi-class splits with class-probability outputs and reach predictive performance close to that of random forests on many datasets. Results on simulated and real-world data show improved robustness to noise, the ability to capture high-order interactions, and competitive accuracy relative to established oblique and axis-orthogonal tree methods. The approaches serve as robust single-tree alternatives and lay the groundwork for future ensemble and SVM-inspired extensions.
Abstract
Traditional decision trees are limited by axis-orthogonal splits, which can perform poorly when the true decision boundaries are oblique. Oblique decision tree methods address this limitation, but they often suffer from high computational cost, difficulty with multi-class classification, and a lack of effective feature selection. In this paper, we introduce LDATree and FoLDTree, two novel frameworks that integrate Uncorrelated Linear Discriminant Analysis (ULDA) and Forward ULDA into a decision tree structure. These methods enable efficient oblique splits, handle missing values, support feature selection, and output both class labels and class probabilities. In evaluations on simulated and real-world datasets, LDATree and FoLDTree consistently outperform axis-orthogonal and other oblique decision tree methods, achieving accuracy comparable to that of random forests. The results highlight the potential of these frameworks as robust alternatives to traditional single-tree methods.
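To make the contrast between axis-orthogonal and oblique splits concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy two-class dataset whose true boundary is the line x0 + x1 = 0, and it uses classical two-class Fisher LDA as a stand-in for ULDA. The function names `fisher_direction` and `best_axis_split_accuracy` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def fisher_direction(X, y):
    """Two-class Fisher discriminant direction w and midpoint threshold t.

    Classical LDA is used here as a stand-in; it is not ULDA.
    """
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter; pinv tolerates a singular scatter matrix.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.pinv(Sw) @ (m1 - m0)
    t = w @ (m0 + m1) / 2.0  # split halfway between the projected class means
    return w, t

def best_axis_split_accuracy(X, y):
    """Best training accuracy of any single axis-aligned threshold split."""
    best = 0.0
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            pred = (X[:, j] > thr).astype(int)
            # Allow either side of the split to be labelled as class 1.
            acc = max(np.mean(pred == y), np.mean(pred != y))
            best = max(best, acc)
    return best

# Toy data (an assumption for illustration): the true boundary is oblique.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

w, t = fisher_direction(X, y)
oblique_acc = np.mean((X @ w > t).astype(int) == y)  # one oblique split
axis_acc = best_axis_split_accuracy(X, y)            # one axis-aligned split

print(f"oblique split accuracy:      {oblique_acc:.3f}")
print(f"best axis-aligned accuracy:  {axis_acc:.3f}")
```

On data like this, a single split along the discriminant direction separates the classes far better than any single axis-aligned threshold, which would need many stacked splits to approximate the diagonal boundary. A tree in the spirit described above would apply such a split recursively at each node.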
