Partition Trees: Conditional Density Estimation over General Outcome Spaces
Felipe Angelim, Alessandro Leite
TL;DR
Partition Trees introduce a unified, nonparametric framework for conditional density estimation over general (mixed-type) outcome spaces by modeling densities as piecewise-constant on data-adaptive partitions defined via Radon–Nikodym derivatives. The method grows trees greedily to minimize a conditional log-loss objective, with an ensemble variant called Partition Forests that averages densities to improve probabilistic predictions. The authors establish $L^1(\nu)$-consistency under standard growth/shrinkage conditions and finite VC assumptions. Empirically, they show that Partition Trees/Forests deliver competitive probabilistic performance on classification and regression benchmarks relative to CART-based trees, CADET, CDTree, and the Random Forest and XGBoost families, while demonstrating robustness to noise and feature redundancy. The work also provides an efficient, scalable algorithm for joint X- and Y-splits and releases an implementation for practical use and further research.
Abstract
We propose Partition Trees, a tree-based framework for conditional density estimation over general outcome spaces, supporting both continuous and categorical variables within a unified formulation. Our approach models conditional distributions as piecewise-constant densities on data-adaptive partitions and learns trees by directly minimizing conditional negative log-likelihood. This yields a scalable, nonparametric alternative to existing probabilistic trees that does not make parametric assumptions about the target distribution. We further introduce Partition Forests, an ensemble extension obtained by averaging conditional densities. Empirically, we demonstrate improved probabilistic prediction over CART-style trees and competitive or superior performance compared to state-of-the-art probabilistic tree methods and Random Forests, along with robustness to redundant features and heteroscedastic noise.
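
To make the core idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single X-cell, a one-dimensional continuous outcome on [0, 1] with Lebesgue base measure, and hand-picked Y-partitions rather than greedily learned ones. All function names (`piecewise_constant_density`, `evaluate`, `mean_nll`) are hypothetical. It shows how a piecewise-constant density is formed on a partition, how a Y-split can lower the negative log-likelihood, and how averaging two such densities (as a forest would) still yields a valid density.

```python
import numpy as np

def piecewise_constant_density(y_train, edges):
    """Density value per bin: (fraction of samples in bin) / (bin width)."""
    counts, _ = np.histogram(y_train, bins=edges)
    widths = np.diff(edges)
    return counts / (counts.sum() * widths)

def evaluate(density, edges, y):
    """Look up the density value of the bin containing each y."""
    idx = np.searchsorted(edges, y, side="right") - 1
    idx = np.clip(idx, 0, len(density) - 1)  # put y == edges[-1] in the last bin
    return density[idx]

def mean_nll(density, edges, y, eps=1e-12):
    """Average negative log-likelihood (log-loss) of y under the density."""
    return float(-np.mean(np.log(evaluate(density, edges, y) + eps)))

# Toy outcomes falling into one X-cell (illustrative data, not from the paper).
y_train = np.array([0.1, 0.2, 0.3, 0.9])

# No Y-split: the density is uniform on [0, 1].
edges_root = np.array([0.0, 1.0])
dens_root = piecewise_constant_density(y_train, edges_root)

# One Y-split at 0.5: more mass is assigned where the data concentrate.
edges_a = np.array([0.0, 0.5, 1.0])
dens_a = piecewise_constant_density(y_train, edges_a)

nll_root = mean_nll(dens_root, edges_root, y_train)
nll_split = mean_nll(dens_a, edges_a, y_train)

# A second "tree" with a different partition, then a forest-style average.
edges_b = np.array([0.0, 0.25, 1.0])
dens_b = piecewise_constant_density(y_train, edges_b)

def forest_density(y):
    """Pointwise average of the two per-tree densities."""
    return 0.5 * (evaluate(dens_a, edges_a, y) + evaluate(dens_b, edges_b, y))

# Numerical check that the averaged density integrates to roughly 1.
grid = np.linspace(0.0, 1.0, 100001)
forest_mass = float(np.mean(forest_density(grid)))
```

In this toy setting the split at 0.5 lowers the training log-loss relative to the unsplit cell, which is the kind of gain a greedy split search would compare across candidate splits. For a categorical outcome, the same construction would use the counting measure, so each bin's "width" becomes the number of categories it contains.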
