Consistency of Oblique Decision Tree and its Boosting and Random Forest
Haoran Zhan, Yu Liu, Yingcun Xia
TL;DR
The paper establishes consistency results for oblique decision trees (ODT) and their ensembles. It shows that ODT is consistent for general $L^2$-integrable regression functions and that the ODT-based random forest (ODRF) remains consistent whether its trees are partially or fully grown. It also proposes an ensemble of gradient boosting trees (ODBT) built via orthogonal matching pursuit, proving its consistency and, for certain function classes, fast convergence rates. The authors further introduce two feature-bagging schemes, refine existing ODRF implementations, and report extensive real-data experiments showing improvements over standard RF and related forests. Overall, the work combines oblique partitions with ensemble methods, backed by theoretical guarantees and empirical gains, and connects tree-based methods to neural-network-like representations.
Abstract
Classification and Regression Tree (CART), Random Forest (RF) and Gradient Boosting Tree (GBT) are probably among the most popular statistical learning methods. However, their statistical consistency can only be proved under very restrictive assumptions on the underlying regression function. As an extension to standard CART, the oblique decision tree (ODT), which uses linear combinations of predictors as partitioning variables, has received much attention. ODT tends to perform numerically better than CART and requires fewer partitions. In this paper, we show that ODT is consistent for very general regression functions as long as they are $L^2$ integrable. Then, we prove the consistency of the ODT-based random forest (ODRF), whether fully grown or not. Finally, we propose an ensemble of GBT for regression by borrowing the technique of orthogonal matching pursuit and study its consistency under very mild conditions on the tree structure. After refining existing software packages according to the established theory, extensive experiments on real data sets show that both our ensemble boosting trees and ODRF have noticeable overall improvements over RF and other forests.
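To make concrete what it means to use a linear combination of predictors as the partitioning variable, the following minimal sketch compares a single axis-aligned (CART-style) split with a single oblique split on toy data. It is an illustration only, not the authors' algorithm: the helper names `split_sse` and `best_split_sse`, the simulated data, and the fixed direction `theta` are all assumptions made for this example (a real ODT would have to search for the direction).

```python
import numpy as np

def split_sse(z, y, threshold):
    """SSE after splitting on z <= threshold and predicting each side's mean."""
    left, right = y[z <= threshold], y[z > threshold]
    return sum(((part - part.mean()) ** 2).sum() for part in (left, right) if part.size)

def best_split_sse(z, y):
    """Smallest SSE over thresholds placed midway between sorted distinct values of z."""
    zs = np.unique(z)
    thresholds = (zs[:-1] + zs[1:]) / 2
    return min(split_sse(z, y, t) for t in thresholds)

# Toy data (assumption): the response jumps across the oblique line x1 + x2 = 0.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
theta = np.array([1.0, 1.0]) / np.sqrt(2)   # hypothetical split direction, fixed by hand
z = X @ theta                               # linear combination of the predictors
y = (z > 0).astype(float)

# Axis-aligned split: the partitioning variable is a single coordinate.
axis_sse = min(best_split_sse(X[:, j], y) for j in range(X.shape[1]))

# Oblique split: the partitioning variable is the projection X @ theta.
oblique_sse = best_split_sse(z, y)

print(f"best axis-aligned split SSE: {axis_sse:.3f}")
print(f"best oblique split SSE:      {oblique_sse:.3f}")
```

In this toy setup one oblique split separates the two groups, whereas a single axis-aligned split leaves residual error and would need further partitions to approximate the diagonal boundary. This is the sense in which an oblique tree can require fewer partitions.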
