The Impact of Bootstrap Sampling Rate on Random Forest Performance in Regression Tasks

Michał Iwaniuk, Mateusz Jarosz, Bartłomiej Borycki, Bartosz Jezierski, Jan Cwalina, Stanisław Kaźmierczak, Jacek Mańdziuk

TL;DR

This work presents the first large-scale assessment of bootstrap rate (BR) as a hyperparameter for RF regression, showing that BR = 1.0 is not universally optimal. By evaluating 39 real-world regression datasets with 16 RF configurations across BR values from 0.2 to 5.0 using repeated two-fold cross-validation, the authors uncover substantial performance gains from BR tuning, with BR ≤ 1.0 optimal on 24 datasets and BR > 1.0 on 15. They identify two dataset characteristics, global feature-target dependence and local target variance, as predictors of the preferred BR, and they reproduce the observed bias-variance trade-off through synthetic experiments, establishing conditions under which higher or lower BR is advantageous. The findings advocate incorporating BR tuning, including BR values above 1.0, into RF regression pipelines and AutoML frameworks to optimize predictive performance.

Abstract

Random Forests (RFs) typically train each tree on a bootstrap sample of the same size as the training set, i.e., bootstrap rate (BR) equals 1.0. We systematically examine how varying BR from 0.2 to 5.0 affects RF performance across 39 heterogeneous regression datasets and 16 RF configurations, evaluating with repeated two-fold cross-validation and mean squared error. Our results demonstrate that tuning the BR can yield significant improvements over the default: the best setup relied on BR ≤ 1.0 for 24 datasets, BR > 1.0 for 15, and BR = 1.0 was optimal in 4 cases only. We establish a link between dataset characteristics and the preferred BR: datasets with strong global feature-target relationships favor higher BRs, while those with higher local target variance benefit from lower BRs. To further investigate this relationship, we conducted experiments on synthetic datasets with controlled noise levels. These experiments reproduce the observed bias-variance trade-off: in low-noise scenarios, higher BRs effectively reduce model bias, whereas in high-noise settings, lower BRs help reduce model variance. Overall, BR is an influential hyperparameter that should be tuned to optimize RF regression models.
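
To make the definition of BR concrete, the following minimal sketch draws the row indices for a single tree's bootstrap sample at a given BR. It is an illustration only, not the authors' implementation: the function name `bootstrap_indices`, the rounding of `BR * N` to the nearest integer, and the toy dataset size are all assumptions introduced here.

```python
import random


def bootstrap_indices(n_samples, bootstrap_rate, rng):
    """Draw round(bootstrap_rate * n_samples) row indices with replacement.

    Hypothetical helper: rounding to the nearest integer is an assumption,
    the source only states that the sample has size BR * N.
    """
    if n_samples <= 0 or bootstrap_rate <= 0:
        raise ValueError("n_samples and bootstrap_rate must be positive")
    sample_size = max(1, round(bootstrap_rate * n_samples))
    # Sampling with replacement: the same row may be drawn several times,
    # which is what allows sample sizes larger than the dataset (BR > 1.0).
    return [rng.randrange(n_samples) for _ in range(sample_size)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 1000                # toy dataset size (assumption)
for br in (0.2, 1.0, 5.0):
    idx = bootstrap_indices(n, br, rng)
    unique_fraction = len(set(idx)) / n
    print(f"BR={br}: sample size={len(idx)}, unique fraction={unique_fraction:.3f}")
```

Each tree of a forest would be fitted on the rows selected by one such draw. With BR below 1.0 a tree sees fewer distinct rows, and with BR above 1.0 it sees almost all rows, many of them repeated.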

Paper Structure

This paper contains 20 sections, 21 equations, 6 figures, 81 tables.

Figures (6)

  • Figure 1: Expected fraction of original instances that appear at least once in a bootstrap sample of size $BR \cdot N$, drawn with replacement from a dataset of size $N$. The orange line indicates the classical bootstrap setting with $BR = 1.0$, where approximately 63.2% of the original instances are expected to be included. The curve follows the approximation $\mathbb{E}(U)/N \approx 1 - e^{-BR}$, which becomes asymptotically exact as $N \to \infty$ with fixed $BR$.
  • Figure 2: BR curve profiles for selected representative datasets. For each dataset, the optimal configuration is marked with a red star. The plots for the remaining datasets are provided in a separate section of the paper (label `sec:br_curves`).
  • Figure 3: Distribution of optimal BR values in the best configuration (top left) and across individual RF setups.
  • Figure 4: Test MSE of RFs trained on synthetic datasets under varying noise levels ($\sigma$) and BRs. Each curve represents a different noise setting. The results show that higher BRs perform better when noise is low, while lower BRs are better under high noise. This illustrates the trade-off between resampling intensity and noise sensitivity in model performance.
  • Figure 5: MSE across various BRs $b$ and noise levels $\sigma$ in the pure-noise scenario.
  • ...and 1 more figure
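
The approximation in the Figure 1 caption can be checked numerically. A given instance is missed by all $BR \cdot N$ draws with probability $(1 - 1/N)^{BR \cdot N}$, so the exact expected fraction of distinct instances is $1 - (1 - 1/N)^{BR \cdot N}$, which tends to $1 - e^{-BR}$ as $N$ grows. The sketch below compares the two; the function names and the choice of $N$ values are assumptions made for illustration.

```python
import math


def expected_unique_fraction(n_samples, bootstrap_rate):
    """Exact expected fraction of distinct original instances in a
    bootstrap sample of size bootstrap_rate * n_samples."""
    draws = bootstrap_rate * n_samples
    # Probability that one fixed instance is never drawn: (1 - 1/N)^draws.
    return 1.0 - (1.0 - 1.0 / n_samples) ** draws


def approx_unique_fraction(bootstrap_rate):
    """Large-N approximation 1 - exp(-BR) quoted in the Figure 1 caption."""
    return 1.0 - math.exp(-bootstrap_rate)


for br in (0.2, 1.0, 5.0):
    for n in (100, 10_000):  # illustrative dataset sizes (assumption)
        exact = expected_unique_fraction(n, br)
        approx = approx_unique_fraction(br)
        print(f"BR={br}, N={n}: exact={exact:.4f}, approx={approx:.4f}")
```

For $BR = 1.0$ the approximation gives about 0.632, matching the 63.2% quoted for the classical bootstrap.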