Boosting-Based Sequential Meta-Tree Ensemble Construction for Improved Decision Trees
Ryota Maniwa, Naoki Ichijo, Yuta Nakahara, Toshiyasu Matsushima
TL;DR
This work tackles overfitting in decision trees by adopting meta-trees within a Bayesian framework to obtain Bayes-optimal predictions. It extends to boosting ensembles of meta-trees, constructing them sequentially by minimizing residuals and leveraging either GBDT-like residual learning or posterior-based weighting over meta-trees, with shrinkage as a regularizer. The authors introduce several variants, including MT_gbdt, MT_uni-uni, MT_uni-pos, and MT_pos-pos, and demonstrate through synthetic and benchmark experiments that ensembles of meta-trees can achieve lower Bayes risk and better generalization than traditional ensembles like GBDT and LightGBM, particularly when allowing deeper trees. The practical impact is improved predictive performance and robustness to overfitting in tree-based models, with a flexible framework for weighting meta-trees via uniform or posterior distributions over the explanatory features and their thresholds.
Abstract
The decision tree is one of the most popular approaches in machine learning. However, it suffers from overfitting when the tree is grown too deep. The recently proposed meta-tree addresses this problem, and it also guarantees statistical optimality based on Bayes decision theory. The meta-tree is therefore expected to perform better than a single decision tree. It is also known that ensembles of decision trees, which are typically constructed by boosting algorithms, improve predictive performance more than a single decision tree does. Ensembles of meta-trees are likewise expected to outperform a single meta-tree, but no previous study has constructed multiple meta-trees by boosting. In this study, we therefore propose a method to construct multiple meta-trees using a boosting approach. Through experiments on synthetic and benchmark datasets, we compare the proposed methods with conventional methods that use ensembles of decision trees. Furthermore, while ensembles of decision trees can overfit just as a single decision tree can, our experiments confirm that ensembles of meta-trees can prevent the overfitting caused by tree depth.
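As a rough illustration of the boosting idea referred to above, the following Python sketch builds an additive model by repeatedly fitting a weak learner to the current residuals and adding it with a shrinkage factor. It is only a sketch under stated assumptions: plain one-dimensional regression stumps stand in for meta-trees, squared error is assumed as the loss, and all names (`fit_stump`, `boost`, `predict`, `mse`) are hypothetical and not taken from the paper.

```python
# Sketch of GBDT-style residual boosting with shrinkage.
# ASSUMPTION: ordinary regression stumps are used as weak learners here;
# this is NOT the meta-tree learner or the posterior-based weighting of the paper.

def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree to the residuals by least squares.

    Returns (threshold, left_mean, right_mean).
    Assumes xs contains at least two distinct values.
    """
    best = None
    # The largest x is skipped as a threshold because it would leave the right side empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm


def boost(xs, ys, n_rounds=50, shrinkage=0.1):
    """Sequentially add stumps, each fitted to the residuals of the current model."""
    base = sum(ys) / len(ys)  # initial constant prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Shrinkage scales each new learner's contribution, acting as a regularizer.
        preds = [p + shrinkage * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, shrinkage, stumps


def predict(model, x):
    base, shrinkage, stumps = model
    return base + shrinkage * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [x * x for x in xs]
    for rounds in (1, 10, 50):
        print(rounds, mse(boost(xs, ys, n_rounds=rounds), xs, ys))
```

In this sketch the training error shrinks as rounds are added, which is also why a plain boosted ensemble can overfit; how meta-trees replace the weak learner and how their predictions are weighted are matters for the paper itself and are not reproduced here.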
