Robust Predictions with Ambiguous Time Delays: A Bootstrap Strategy
Jiajie Wang, Zhiyuan Jerry Lin, Wen Chen
TL;DR
The paper tackles predictive modeling for multivariate time series with nondeterministic time delays by introducing the Time Series Model Bootstrap (TSMB). TSMB treats the delay vector as a random variable: it draws bootstrap samples, infers a delay realization $\hat{\bm{\delta}}^b$ from each, trains one predictor $f_{\hat{\bm{\delta}}^b}$ per realization, and aggregates the ensemble's outputs to approximate $\mathbb{E}[Y \mid X]$. The result is a model-agnostic, uncertainty-aware framework. Across nine real-world datasets, with base models including GBDT and TFT, TSMB consistently outperforms traditional time delay estimation baselines (TDMI, GCC) in predictive accuracy, and it offers insight into the estimated delay distributions and prediction-interval coverage. The work highlights both the practical benefit of accommodating delay uncertainty and the difficulty of calibrating prediction intervals, and it offers a scalable, extensible approach to robust forecasting under delay variability.
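The bootstrap loop summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the moving block bootstrap, the cross-correlation delay estimator, the linear base model, the synthetic data, and every function name here (`shift`, `estimate_delay`, `block_bootstrap_indices`, `tsmb_predict`) are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def shift(x, d):
    """Delay x by d >= 0 steps: out[t] = x[t - d]; the first d entries are NaN."""
    out = np.full(len(x), np.nan)
    if d == 0:
        out[:] = x
    else:
        out[d:] = x[:-d]
    return out

def estimate_delay(x, y, max_delay):
    """Stand-in estimator (assumption): lag in [0, max_delay] maximizing |corr(x[t-d], y[t])|."""
    scores = []
    for d in range(max_delay + 1):
        xs = shift(x, d)
        ok = ~np.isnan(xs)
        scores.append(abs(np.corrcoef(xs[ok], y[ok])[0, 1]))
    return int(np.argmax(scores))

def block_bootstrap_indices(n, block_len, rng):
    """Moving block bootstrap: concatenate randomly chosen contiguous index blocks."""
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    idx = np.concatenate([np.arange(s, s + block_len) for s in starts])
    return idx[:n]

def tsmb_predict(x, y_train, n_boot=20, max_delay=8, block_len=40, seed=0):
    """Train one predictor per bootstrapped delay and average the test predictions."""
    rng = np.random.default_rng(seed)
    n_train = len(y_train)
    x_train = x[:n_train]
    preds, delays = [], []
    for _ in range(n_boot):
        # 1. Resample the training series and infer a delay realization from it.
        idx = block_bootstrap_indices(n_train, block_len, rng)
        d = estimate_delay(x_train[idx], y_train[idx], max_delay)
        # 2. Fit a base model (here: a straight line) on data aligned with that delay.
        xs = shift(x_train, d)
        ok = ~np.isnan(xs)
        coef = np.polyfit(xs[ok], y_train[ok], 1)
        # 3. Predict the held-out period using the same alignment.
        preds.append(np.polyval(coef, shift(x, d)[n_train:]))
        delays.append(d)
    # 4. Aggregate the ensemble as a rough approximation of E[Y | X].
    return np.mean(preds, axis=0), np.array(delays)

# Synthetic example (assumption): y lags x by 3 steps, plus noise.
rng = np.random.default_rng(42)
x = rng.standard_normal(500)
y = 2.0 * shift(x, 3) + 0.5 * rng.standard_normal(500)
y[:3] = 0.0  # replace the undefined warm-up values
n_train = 400
y_hat, delays = tsmb_predict(x, y[:n_train])
y_test = y[n_train:]
print("most frequent delay:", np.bincount(delays).argmax())
print("ensemble test MSE:", np.mean((y_hat - y_test) ** 2))
```

The base model and delay estimator are deliberately interchangeable, which mirrors the model-agnostic framing: any estimator that returns a delay from a resampled series, and any regressor that can be fit on delay-aligned features, could be substituted in steps 1 and 2.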
Abstract
In contemporary data-driven environments, generating and processing multivariate time series data is a pervasive challenge, often complicated by time delays between different series. These delays, which arise from sources such as varying data transmission dynamics, sensor interference, and environmental changes, introduce significant complexity. Traditional Time Delay Estimation methods typically assume a fixed, constant time delay and may therefore fail to capture this variability, compromising the precision of predictive models in diverse settings. To address this issue, we introduce the Time Series Model Bootstrap (TSMB), a versatile framework designed to handle potentially varying or even nondeterministic time delays in time series modeling. Unlike traditional approaches, which hinge on the assumption of a single, consistent time delay, TSMB adopts a nonparametric stance that acknowledges and incorporates time delay uncertainty. TSMB significantly improves the predictive performance of models trained and deployed within the framework, making it well suited to a wide range of dynamic, interconnected data environments.
