Alpha-Trimming: Locally Adaptive Tree Pruning for Random Forests

Nikola Surjanovic; Andrew Henrey; Thomas M. Loughin

TL;DR

The paper tackles the bias-variance tension in random forests (RFs) by proposing alpha-trimmed RFs, which prune regression trees adaptively according to the local signal-to-noise ratio (SNR). It develops Accumulated Information Pruning (AIP), a bottom-up, information-criterion-driven pruning of each tree, and extends it to ensembles through a tunable parameter $\alpha$ that governs pruning strength without refitting. The pruning criteria are formalized via a modified Bayesian information criterion with penalties $P_{0,n}$ and $P_{1,n}$ and are shown to be statistically consistent for model selection between tree-root and tree-stump configurations. The amount of pruning is related to the data's SNR, and the procedure remains computationally efficient, with complexity $O(Bn)$ for fixed $\alpha$. Empirically, alpha-trimmed RFs frequently reduce mean squared prediction error relative to standard RFs across 46 data sets and perform competitively with, or better than, RFs tuned by global node-size adjustments, while avoiding cross-validation. The method offers a practical, parallelizable, refitting-free approach to locally adaptive tree sizing in RFs, improving predictive performance in regions of varying SNR.

Abstract

We demonstrate that adaptively controlling the size of individual regression trees in a random forest can improve predictive performance, contrary to the conventional wisdom that trees should be fully grown. A fast pruning algorithm, alpha-trimming, is proposed as an effective approach to pruning trees within a random forest, where more aggressive pruning is performed in regions with a low signal-to-noise ratio. The amount of overall pruning is controlled by adjusting the weight on an information criterion penalty as a tuning parameter, with the standard random forest being a special case of our alpha-trimmed random forest. A remarkable feature of alpha-trimming is that, once the trees in the random forest have been fully grown, its tuning parameter can be adjusted without refitting them. In a benchmark suite of 46 example data sets, mean squared prediction error is often substantially lowered by using our pruning algorithm and is never substantially increased compared to a random forest with fully-grown trees at default parameter settings.
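The idea of a penalty weight that can be changed without refitting can be illustrated with a minimal sketch. It is not the paper's criterion: the penalties $P_{0,n}$ and $P_{1,n}$ are not reproduced here, and the code assumes instead a generic BIC-style cost (within-leaf sum of squares divided by a plug-in noise variance `sigma2`, plus `alpha * log(n_total)` per leaf). The names `Node`, `leaf`, `split`, `prune`, and `count_leaves` are hypothetical helpers introduced only for this illustration.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical node storing sufficient statistics of its training responses."""
    n: int
    sum_y: float
    sum_y2: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def sse(self) -> float:
        # Sum of squared errors around the node mean.
        return max(self.sum_y2 - self.sum_y ** 2 / self.n, 0.0)


def leaf(ys: Sequence[float]) -> Node:
    return Node(len(ys), float(sum(ys)), float(sum(y * y for y in ys)))


def split(left: Node, right: Node) -> Node:
    # An internal node's statistics are the sums of its children's statistics.
    return Node(left.n + right.n, left.sum_y + right.sum_y,
                left.sum_y2 + right.sum_y2, left, right)


def prune(node: Node, alpha: float, sigma2: float, n_total: int) -> Tuple[float, Node]:
    """Bottom-up pass returning (accumulated cost, pruned copy of the subtree).

    Assumed cost of a leaf: SSE / sigma2 + alpha * log(n_total).
    This is a stand-in criterion, not the one defined in the paper.
    """
    leaf_cost = node.sse() / sigma2 + alpha * math.log(n_total)
    collapsed = Node(node.n, node.sum_y, node.sum_y2)
    if node.left is None or node.right is None:
        return leaf_cost, collapsed
    left_cost, left = prune(node.left, alpha, sigma2, n_total)
    right_cost, right = prune(node.right, alpha, sigma2, n_total)
    subtree_cost = left_cost + right_cost
    # Strict inequality: with alpha = 0 a split never increases SSE, so nothing is pruned.
    if leaf_cost < subtree_cost:
        return leaf_cost, collapsed
    return subtree_cost, Node(node.n, node.sum_y, node.sum_y2, left, right)


def count_leaves(node: Node) -> int:
    if node.left is None or node.right is None:
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


# Toy fully grown tree: a high-signal split on the left, a near-noise split on the right.
tree = split(
    split(leaf([0, 0, 0, 0]), leaf([10, 10, 10, 10])),
    split(leaf([4, 6, 5, 5]), leaf([5, 6, 4, 6])),
)

# The same grown tree is reused for every alpha; only the pruning pass is repeated.
leaf_counts = {a: count_leaves(prune(tree, a, 1.0, tree.n)[1]) for a in (0.0, 1.0, 100.0)}
print(leaf_counts)  # {0.0: 4, 1.0: 3, 100.0: 1}
```

In this toy example, `alpha = 0` leaves the tree fully grown, a moderate `alpha` collapses only the low-signal split, and a very large `alpha` collapses the tree to its root. Because `prune` reads only the stored node statistics and returns a pruned copy, trying several values of `alpha` requires no regrowing of the tree.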

Paper Structure (19 sections, 4 theorems, 43 equations, 5 figures, 1 table, 2 algorithms)

Introduction
Preliminaries and notation
Alpha-trimmed random forests
Accumulated information pruning algorithm
Tree roots and stumps
Climbing up the tree
Alpha-trimming
Proposed information criterion
Theoretical results
Previous work
Simulation study design
SNR experiments
Constant SNR
Mixed low and high SNR
Data sets and methods
...and 4 more sections

Key Result

Proposition 3.1

Suppose that $Z_i = (X_i, Y_i)$ are i.i.d. with the distribution of each $X_i$ supported on $[0,1]^d$ with a strictly positive density with respect to Lebesgue measure on $[0,1]^d$. Further, suppose that $Y_i \mid X_i = x_i \sim \mathcal{N}(\mu, \sigma^2)$ for some $\mu \in \mathbb{R}$ and $\sigma^2$ …

Figures (5)

Figure 1: A logistic regression curve (red) with estimated regression curves (blue). Left: predictions from a random forest with tuned global node size. Right: predictions from our proposed alpha-trimmed random forest with local node size selection. The alpha-trimmed random forest approximates the true regression curve well in regions with both a high and low signal-to-noise ratio.
Figure 2: Regression tree notation. The shaded region in each panel is $N_0$, $N_1$, and $N_2$ from left to right. The newly added line segments are $s_1$, $s_2$, and $s_3$.
Figure 3: Top left to bottom right: Predicted values from RF-5, RF-25, RF-500, and AlphaTrim applied to a modified version of the elbow data.
Figure 4: Comparison of alpha-trimmed RFs and default RFs on 46 data sets with approximate 95% $z$-based confidence intervals of ratios of RMSPEs. Blue, orange, and red bars indicate cases where the alpha-trimmed RF performed better than, similar to, or worse than the default RF.
Figure 5: Comparison of alpha-trimmed RFs and tuned RFs on 46 data sets.

Theorems & Definitions (4)

Proposition 3.1
Proposition 3.2
Proposition 3.3
Proposition 3.4
