A Simple and Effective Random Forest Modelling for Nonlinear Time Series Data
Shihao Zhang, Zudi Lu, Chao Zheng
TL;DR
The paper tackles nonlinear time-series forecasting, where temporal dependence undermines the bootstrap resampling used by traditional random forests. It introduces RF-RW, a bootstrap-free forest that preserves serial dependence while injecting tree-level diversity through independent random weights applied to the training series. The authors prove non-asymptotic concentration bounds and asymptotic uniform consistency for both fixed- and high-dimensional feature spaces. These results are supported by extensive simulations and by an application to UK COVID-19 daily cases, in which RF-RW attains the lowest prediction error among the compared methods. The work provides a practical, theoretically grounded approach for nonlinear time series and opens avenues for spatio-temporal extensions and quantile estimation.
Abstract
In this paper, we propose Random Forests by Random Weights (RF-RW), a theoretically grounded and practically effective alternative RF modelling approach for nonlinear time series data, for which existing RF-based approaches struggle to adequately capture temporal dependence. RF-RW reconciles the strengths of classic RF with the temporal dependence inherent in time series forecasting. Specifically, it avoids the bootstrap resampling procedure and thereby preserves the serial dependence structure, whilst incorporating independent random weights to reduce correlations among trees. We establish non-asymptotic concentration bounds and asymptotic uniform consistency guarantees, for both fixed- and high-dimensional feature spaces, which extend beyond existing theoretical analyses of RF. Extensive simulation studies demonstrate that RF-RW outperforms existing RF-based approaches and other benchmarks such as SVM and LSTM. It also achieves the lowest error among competitors in our real-data example of predicting UK COVID-19 daily cases.
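The mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Exp(1) weight distribution, the lag-based features, the simulated series, the use of scikit-learn's DecisionTreeRegressor, and the names simulate_nar, make_lagged, fit_rf_rw and predict_rf_rw are all assumptions made here for illustration. Each tree is fitted to the full, unresampled series, so the temporal ordering is untouched, and diversity among trees comes from independent random observation weights rather than from bootstrap draws.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def simulate_nar(n, seed=0):
    """Simulate a toy nonlinear AR(1) series (illustrative choice, not from the paper)."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = 2.5 * np.sin(y[t - 1]) + 0.3 * rng.standard_normal()
    return y


def make_lagged(series, n_lags):
    """Return (X, y) where column k of X holds lag k+1 of the target y."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    X = np.column_stack(
        [series[n_lags - k - 1 : n - k - 1] for k in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y


def fit_rf_rw(X, y, n_trees=100, max_features=None, min_samples_leaf=5, seed=0):
    """Fit trees on the full sample, each with its own random observation weights."""
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        # Assumption: i.i.d. Exp(1) weights with mean 1. The abstract only says
        # "independent random weights"; the distribution is chosen here.
        w = rng.exponential(scale=1.0, size=len(y))
        tree = DecisionTreeRegressor(
            max_features=max_features,  # None = consider all features at each split
            min_samples_leaf=min_samples_leaf,
            random_state=int(rng.integers(2**31 - 1)),
        )
        # No bootstrap: every observation is used once, in its original order.
        tree.fit(X, y, sample_weight=w)
        trees.append(tree)
    return trees


def predict_rf_rw(trees, X):
    """Average the predictions of the individual trees."""
    return np.mean([t.predict(X) for t in trees], axis=0)


if __name__ == "__main__":
    series = simulate_nar(600, seed=1)
    X, y = make_lagged(series, n_lags=2)
    # Chronological split: train on the past, evaluate on the future.
    trees = fit_rf_rw(X[:450], y[:450], n_trees=50)
    pred = predict_rf_rw(trees, X[450:])
    print("one-step-ahead test MSE:", np.mean((pred - y[450:]) ** 2))
```

The chronological train/test split in the usage example avoids leaking future observations into training. It is a design choice of this sketch and does not reproduce the evaluation protocol of the paper.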
