Ensembles of Probabilistic Regression Trees
Alexandre Seiller, Éric Gaussier, Emilie Devijver, Marianne Clausel, Sami Alkhoury
TL;DR
This work develops and analyzes ensembles of probabilistic regression trees (PR trees), including PR-RF, PR-GBT, and P-BART, and proves their consistency under standard assumptions. By blending soft region memberships with ensemble methods, it demonstrates favorable bias-variance properties and competitive predictive accuracy against strong regressors such as RF, GBT, and BART across diverse datasets. Theoretical results establish consistency and posterior concentration for the Bayesian variant, while extensive experiments illustrate practical performance and trade-offs, highlighting that no single method dominates and choices depend on the bias-variance emphasis. The study advances a principled, smooth alternative to traditional tree ensembles with potential for uncertainty quantification via Bayesian extensions and quantile regression.
Abstract
Tree-based ensemble methods such as random forests, gradient-boosted trees, and Bayesian additive regression trees have been successfully used for regression problems in many applications and research studies. In this paper, we study ensemble versions of probabilistic regression trees that provide smooth approximations of the objective function by assigning each observation to each region with respect to a probability distribution. We prove that the ensemble versions of probabilistic regression trees considered are consistent, and experimentally study their bias-variance trade-off and compare them with the state of the art in terms of predictive performance.
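To make the idea of soft region membership concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes the input is perturbed by Gaussian noise with standard deviation `sigma` (an illustrative choice; the abstract only says "a probability distribution"), and the function names, split points, and leaf values are all hypothetical. The prediction is the leaf values averaged by the probability that the noisy input falls in each region, which gives a smooth function of `x` instead of a step function.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, using only the standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def soft_memberships(x, edges, sigma):
    """Probability that x + N(0, sigma^2) noise lands in each interval.

    `edges` are sorted split points; they define len(edges) + 1 regions.
    The Gaussian noise model is an assumption made for this sketch.
    """
    bounds = [-math.inf] + list(edges) + [math.inf]
    cdf = [normal_cdf((b - x) / sigma) for b in bounds]
    return [cdf[k + 1] - cdf[k] for k in range(len(bounds) - 1)]


def soft_predict(x, edges, leaf_values, sigma):
    """Smooth prediction: leaf values weighted by region probabilities."""
    probs = soft_memberships(x, edges, sigma)
    return sum(p * g for p, g in zip(probs, leaf_values))


def hard_predict(x, edges, leaf_values):
    """Standard tree prediction: x belongs to exactly one region."""
    k = sum(x > e for e in edges)
    return leaf_values[k]


if __name__ == "__main__":
    edges = [0.0]             # a single split at 0 (hypothetical)
    leaf_values = [1.0, 3.0]  # hypothetical leaf weights
    for x in (-2.0, -0.25, 0.0, 0.25, 2.0):
        print(x, hard_predict(x, edges, leaf_values),
              round(soft_predict(x, edges, leaf_values, sigma=0.5), 3))
```

In this sketch the hard tree jumps from 1.0 to 3.0 at the split, while the soft prediction moves continuously between the two values; as `sigma` shrinks, the soft prediction approaches the hard one away from the split. An ensemble would combine several such soft trees, for example by averaging or by additive boosting.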
