Asymptotic confidence bands for centered purely random forests
Natalie Neumeyer, Jan Rabe, Mathias Trabs
TL;DR
This paper develops asymptotic uniform confidence bands for regression functions estimated by centered purely random forests (CPRFs) in a multivariate nonparametric setting. By interpreting CPRFs as generalized U-statistics and applying Gaussian-approximation techniques for the supremum of empirical processes, it derives a nonparametric confidence band around the CPRF estimator that adapts to local variance via $\Psi_k(x)$ and does not rely on a limit distribution. The authors introduce the Ehrenfest centered purely random forest, which achieves minimax optimal rates, and provide a pointwise CLT as well as a uniform convergence framework, culminating in a practical methodology for honest confidence bands. Simulation studies illustrate finite-sample performance, showing favorable coverage and band radii relative to histogram-based methods, with band width adapting to local feature density.
Abstract
In a multivariate nonparametric regression setting, we construct explicit asymptotic uniform confidence bands for centered purely random forests. Since the most popular example in this class of random forests, namely the uniformly centered purely random forests, is well known to suffer from suboptimal rates, we propose a new type of purely random forests, called the Ehrenfest centered purely random forests, which achieve minimax optimal rates. Our main confidence band theorem applies to both types of random forests. The proof is based on an interpretation of random forests as generalized U-statistics together with a Gaussian approximation of the supremum of empirical processes. Our theoretical findings are illustrated in simulation examples.
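To make the object of study concrete, the following is a minimal sketch of a uniformly centered purely random forest as it is commonly described in the literature: every cell of $[0,1]^d$ is split at its midpoint along a coordinate drawn uniformly at random, independently of the data, and the forest averages the resulting tree estimates. It does not implement the Ehrenfest variant or the confidence band construction. The function names (`draw_tree`, `leaf_index`, `forest_predict`), the heap-style node indexing, and the convention of predicting 0 on an empty cell are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def draw_tree(depth, dim, rng):
    """Draw one split coordinate per internal node (heap order), independent of the data."""
    return rng.integers(0, dim, size=2**depth - 1)

def leaf_index(tree, depth, x):
    """Return the heap index of the leaf cell of [0, 1]^d that contains x."""
    lo, hi = np.zeros(len(x)), np.ones(len(x))
    node = 0
    for _ in range(depth):
        j = tree[node]                    # coordinate split at this node
        mid = 0.5 * (lo[j] + hi[j])       # centered: always split at the midpoint
        if x[j] < mid:
            hi[j] = mid
            node = 2 * node + 1           # left child
        else:
            lo[j] = mid
            node = 2 * node + 2           # right child
    return node

def forest_predict(X, Y, x, depth, n_trees, seed=0):
    """Average, over n_trees random partitions, the mean response in the cell containing x."""
    rng = np.random.default_rng(seed)
    X, Y, x = np.asarray(X, float), np.asarray(Y, float), np.asarray(x, float)
    preds = []
    for _ in range(n_trees):
        tree = draw_tree(depth, X.shape[1], rng)
        target = leaf_index(tree, depth, x)
        in_cell = np.array([leaf_index(tree, depth, xi) == target for xi in X])
        # Assumed convention: an empty cell contributes a prediction of 0.
        preds.append(Y[in_cell].mean() if in_cell.any() else 0.0)
    return float(np.mean(preds))
```

In one dimension every tree splits the same coordinate, so the sketch reduces to a regular histogram estimator with `2**depth` bins. In higher dimensions the trees differ, and averaging over them is what distinguishes the forest from a single partition.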
