A Gaussian Process-based Streaming Algorithm for Prediction of Time Series With Regimes and Outliers
Daniel Waxman, Petar M. Djurić
TL;DR
The paper addresses online time-series prediction under regime switching and outliers by improving on Gaussian-process-based models. It introduces LINTEL, a linear-time online inference algorithm that uses Markovian Gaussian processes and Kalman filtering to provide exact predictive distributions with constant-time updates. LINTEL is faster than the earlier INTEL approach while matching or improving its predictive quality. Key contributions include a full state-space GP formulation, a fusion scheme based on arithmetic averaging for combining experts, a principled mean-function update mechanism, and experiments showing substantial speedups on synthetic and real data. The work advances real-time GP forecasting in the presence of regime changes and outliers, enabling scalable online prediction for high-throughput applications.
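To make the constant-time update concrete, the following is a minimal sketch of one Kalman filter step for a Markovian GP. It assumes a Matérn-1/2 (Ornstein-Uhlenbeck) kernel with Gaussian observation noise, which the text above does not specify; the function name `ou_kalman_step` and all hyperparameter names and defaults are hypothetical and are not taken from the paper.

```python
import math


def ou_kalman_step(mean, var, y, dt,
                   lengthscale=1.0, signal_var=1.0, noise_var=0.1):
    """One predict-then-update step for a GP with kernel
    k(t, t') = signal_var * exp(-|t - t'| / lengthscale),
    written as a scalar state-space model (illustrative assumption).

    Returns the new filtering mean and variance, plus the one-step-ahead
    predictive mean and variance of the observation y.
    """
    # Exact discretisation of the Ornstein-Uhlenbeck process over a gap dt.
    a = math.exp(-dt / lengthscale)
    q = signal_var * (1.0 - a * a)

    # Predict: propagate the previous filtering distribution forward.
    m_pred = a * mean
    p_pred = a * a * var + q

    # Predictive distribution of the next observation.
    y_mean = m_pred
    y_var = p_pred + noise_var

    # Update: condition on the observed value y.
    gain = p_pred / y_var
    m_new = m_pred + gain * (y - y_mean)
    p_new = (1.0 - gain) * p_pred
    return m_new, p_new, y_mean, y_var


# Start from the stationary prior N(0, signal_var) and stream observations.
state_mean, state_var = 0.0, 1.0
for obs in [0.5, 0.7, 0.6]:
    state_mean, state_var, pred_mean, pred_var = ou_kalman_step(
        state_mean, state_var, obs, dt=1.0)
```

Each step touches only the previous mean and variance, so its cost does not grow with the number of observations seen so far. For this kernel the one-step-ahead predictive distribution agrees with the one obtained by conditioning the full GP on all past data.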
Abstract
Online prediction of time series under regime switching is a widely studied problem in the literature, with many celebrated approaches. Using the non-parametric flexibility of Gaussian processes, the recently proposed INTEL algorithm provides a product-of-experts approach to online prediction of time series under possible regime switching, including the special case of outliers. This is achieved by adaptively combining several candidate models, each reporting its predictive distribution at time $t$. However, the INTEL algorithm uses a finite context window approximation to the predictive distribution, the computation of which scales cubically with the maximum lag, or quartically when exact predictive distributions are used instead. We introduce LINTEL, which uses the exact filtering distribution at time $t$ with constant-time updates, making the time complexity of the streaming algorithm optimal. We additionally note that the weighting mechanism of INTEL is better suited to a mixture-of-experts approach, and propose a fusion policy based on arithmetic averaging for LINTEL. We show experimentally that our proposed approach is over five times faster than INTEL under reasonable settings while producing better-quality predictions.
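The difference between the two fusion policies mentioned in the abstract can be illustrated with Gaussian predictive distributions. The sketch below is not the paper's implementation: it assumes each expert reports a Gaussian mean and variance, that the weights are given, and that the arithmetic average is summarised by its first two moments. The function names `fuse_arithmetic` and `fuse_product` are hypothetical.

```python
def fuse_arithmetic(means, variances, weights):
    """Mixture of experts: weighted arithmetic average of the densities.

    Returns the mean and variance of the resulting Gaussian mixture
    (moment matching; the mixture itself is generally not Gaussian).
    """
    total = sum(weights)
    w = [wi / total for wi in weights]
    mean = sum(wi * mi for wi, mi in zip(w, means))
    second_moment = sum(wi * (vi + mi * mi)
                        for wi, mi, vi in zip(w, means, variances))
    return mean, second_moment - mean * mean


def fuse_product(means, variances, weights):
    """Product of experts: weighted geometric average of the densities.

    For Gaussian experts the result is Gaussian, with precision equal to
    the weighted sum of the experts' precisions.
    """
    total = sum(weights)
    w = [wi / total for wi in weights]
    precision = sum(wi / vi for wi, vi in zip(w, variances))
    var = 1.0 / precision
    mean = var * sum(wi * mi / vi for wi, mi, vi in zip(w, means, variances))
    return mean, var


# Two experts that disagree about the next value.
arith = fuse_arithmetic([0.0, 2.0], [1.0, 1.0], [0.5, 0.5])
prod = fuse_product([0.0, 2.0], [1.0, 1.0], [0.5, 0.5])
```

In this example both rules place the fused mean halfway between the experts, but the arithmetic average reports a larger variance than the product because it accounts for the disagreement between the expert means.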
