Adaptive Split Balancing for Optimal Random Forest

Yuqian Zhang, Weijie Ji, Jelena Bradic

TL;DR

The paper introduces Adaptive Split Balancing Forests (ASBF) and Local ASBFs (L-ASBF) that replace conventional random feature-based splits with permutation-based, balanced splitting rules to reduce the Jensen gap and leaf diameters. By enforcing alpha-fraction constraints and balanced splitting across directions, and by integrating honest, data-dependent impurity measures, the methods achieve minimax-optimal IMSE rates over Lipschitz and Hölder smoothness classes, with L-ASBF extending to higher-order smoothness (q>1). Theoretical results establish IMSE and uniform convergence rates, and practical benefits are demonstrated through causal inference applications (ATE via AIPW and double ML) and extensive simulations/real-data experiments showing improved accuracy over baseline random forests and local linear forests. The findings highlight that controlled reduction of auxiliary randomness, together with local polynomial fitting, yields faster convergence and more reliable inference in moderate to high dimensions, with clear implications for causal estimation and structured prediction tasks.

Abstract

In this paper, we propose a new random forest algorithm that constructs the trees using a novel adaptive split-balancing method. Rather than relying on the widely used random feature selection, we propose a permutation-based balanced splitting criterion. The adaptive split balancing forest (ASBF) achieves minimax optimality under the Lipschitz class. Its localized version, which fits local regressions at the leaf level, attains the minimax rate under the broad Hölder class $\mathcal{H}^{q,\beta}$ of problems for any $q\in\mathbb{N}$ and $\beta\in(0,1]$. We identify that over-reliance on auxiliary randomness in tree construction may compromise the approximation power of trees, leading to suboptimal results. Conversely, the proposed less random, permutation-based approach demonstrates optimality over a wide range of models. Although random forests are known to perform well empirically, their theoretical convergence rates are slow. Simplified versions that construct trees without data dependence offer faster rates but lack adaptability during tree growth. Our proposed method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data. Additionally, we establish uniform upper bounds and demonstrate that ASBF improves dimensionality dependence in average treatment effect estimation problems. Simulation studies and real-world applications demonstrate our methods' superior performance over existing random forests.
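
To make the idea of balanced, permutation-based splitting concrete, the following is a minimal single-tree sketch in Python. It is an illustration under stated assumptions, not the authors' algorithm: the names `best_balanced_split` and `grow_tree`, the `min_leaf` parameter, and the choice to draw one fresh random permutation of the coordinates per round of $d$ splits are all hypothetical. Honest sample splitting, forest averaging, and local regressions in the leaves are omitted.

```python
import numpy as np


def best_balanced_split(x, y, alpha, min_leaf):
    """Threshold on one coordinate that minimizes the within-child sum of
    squares, restricted to splits keeping at least floor(alpha * n) points
    (and at least min_leaf points) in each child. Returns None if no
    admissible split exists."""
    n = len(x)
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    lo = max(min_leaf, int(np.floor(alpha * n)))  # fewest points per child
    best_sse, best_thr = np.inf, None
    for k in range(lo, n - lo + 1):  # k = number of points sent left
        if xs[k - 1] == xs[k]:
            continue  # tied values cannot be separated
        sse = k * ys[:k].var() + (n - k) * ys[k:].var()
        if sse < best_sse:
            best_sse, best_thr = sse, xs[k - 1]
    return best_thr


def grow_tree(X, y, alpha=0.3, min_leaf=5, rng=None, pending=()):
    """Grow one tree. `pending` holds the coordinates not yet split in the
    current round; when it is empty a new permutation is drawn (assumption),
    so every coordinate is split once per round along each path."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    leaf = {"leaf": True, "n": n, "value": float(y.mean())}
    if n < 2 * min_leaf:
        return leaf
    if len(pending) == 0:
        pending = tuple(int(j) for j in rng.permutation(d))
    dim, rest = pending[0], pending[1:]
    thr = best_balanced_split(X[:, dim], y, alpha, min_leaf)
    if thr is None:
        return leaf
    left = X[:, dim] <= thr
    return {
        "leaf": False, "n": n, "dim": dim, "thr": float(thr),
        "left": grow_tree(X[left], y[left], alpha, min_leaf, rng, rest),
        "right": grow_tree(X[~left], y[~left], alpha, min_leaf, rng, rest),
    }
```

The split location is still chosen from the data (by impurity), but only within the admissible range and only along the coordinate whose turn it is. In this sketch that is what limits the auxiliary randomness while keeping the leaves from becoming long and thin in any one direction.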

Paper Structure

This paper contains 22 sections, 12 theorems, 238 equations, 7 figures, 3 tables, 4 algorithms.

Key Result

Lemma 2.1

For any $r>0$ and $\alpha\in(0,0.5]$, the leaves constructed by Algorithm \ref{alg:balance_RF} satisfy

Figures (7)

  • Figure 1: IMSE for the Hölder class $\mathcal{H}^{q,\beta}$ with $s=q+\beta$ and $d\in\{2,4\}$. Here, $y$ denotes IMSE as $O_p(N^{-y})$ excluding log terms; see Table \ref{table:rate}. Klu, A&G, D&S, Biau, and Chietal refer to klusowski2021sharp, arlot2014analysis, duroux2018impact, biau2012analysis, and chi2022asymptotic, which provided rates for certain integer $s$. L-ASBF is the proposed local ASBF with $\alpha=0.5$, achieving the minimax optimal rate for any $s>0$.
  • Figure 2: Illustrations of the balanced tree-growing process with $d = 2$. The purple shading represents the splitting range that satisfies the $\alpha$-fraction constraint. The red lines indicate the chosen splits.
  • Figure 3: An illustration of splitting directions undergone by a leaf in Algorithm \ref{alg:balance_GRF}.
  • Figure 4: IMSE convergence rates from Theorem \ref{thm:balance_consistency} for different feature dimensions $d$: $N^{-\frac{2\log (1-\alpha)}{d\log (\alpha)+2\log (1-\alpha)}}$ (top) and $N^{-\frac{2}{d+2}}$ (bottom). The darker plot represents the minimax optimal rate. The darker the overlap, the closer the two rates.
  • Figure 5: Boxplots of $\log(\text{RMSE}+1)$ under Setting (a) with a varying sample size.
  • ...and 2 more figures
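
The two rates in the Figure 4 caption can be compared numerically. The short Python sketch below evaluates both exponents; the function names are hypothetical and the formulas are copied from the caption. At $\alpha=0.5$ the two logarithms coincide, so the ASBF exponent reduces to $2/(d+2)$, while for $\alpha<0.5$ it is strictly smaller, meaning a slower rate.

```python
import math


def asbf_exponent(d, alpha):
    """Exponent y in IMSE ~ N^{-y} for the first rate in the Figure 4 caption."""
    log_a, log_1ma = math.log(alpha), math.log(1.0 - alpha)
    return 2.0 * log_1ma / (d * log_a + 2.0 * log_1ma)


def minimax_exponent(d):
    """Exponent of the second rate in the caption, N^{-2/(d+2)}."""
    return 2.0 / (d + 2.0)


if __name__ == "__main__":
    for d in (2, 4, 10):
        for alpha in (0.5, 0.3, 0.1):
            gap = minimax_exponent(d) - asbf_exponent(d, alpha)
            print(f"d={d:2d} alpha={alpha:.1f} gap={gap:.4f}")
```

The gap grows as $\alpha$ moves away from $0.5$, which matches the caption's note that the rates are closest where the plots overlap.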

Theorems & Definitions (22)

  • Lemma 2.1
  • Theorem 2.2
  • Theorem 3.1
  • Lemma 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Remark 1: Technical challenges of forest-based ATE estimation
  • Lemma S.1: Theorem 7 of wager2015adaptive
  • Lemma S.2: Theorem 10 of wager2015adaptive
  • Lemma S.3: Lemma 12 of wager2015adaptive
  • ...and 12 more