A Gaussian Process-based Streaming Algorithm for Prediction of Time Series With Regimes and Outliers

Daniel Waxman; Petar M. Djurić

TL;DR

The paper addresses online time-series prediction under regime switching and outliers using Gaussian-process-based models. It introduces LINTEL, a linear-time online inference algorithm that uses Markovian Gaussian processes and Kalman filtering to compute exact predictive distributions with constant-time updates per observation. LINTEL is faster than the earlier INTEL approach while matching or improving predictive quality. Key contributions include a state-space GP formulation, a fusion scheme that combines experts by arithmetic averaging, a principled mechanism for updating the mean function, and experiments on synthetic and real data showing substantial speedups. The work advances real-time GP forecasting in the presence of regime changes and outliers, enabling scalable online prediction for high-throughput applications.
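
To illustrate the difference between fusing experts by arithmetic averaging (a mixture of experts) and by a product of experts, the following minimal sketch combines Gaussian predictive distributions both ways. The function names, the moment-matched Gaussian summary of the mixture, and the weighted-product form are assumptions made for illustration; they are not taken from the paper.

```python
def fuse_mixture(means, variances, weights):
    """Arithmetic averaging of Gaussian experts (mixture of experts).

    Returns the mean and variance of the mixture sum_k w_k N(m_k, v_k),
    i.e. a moment-matched Gaussian summary (an illustrative assumption).
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalize weights to sum to one
    mean = sum(wi * mi for wi, mi in zip(w, means))
    second_moment = sum(
        wi * (vi + mi * mi) for wi, mi, vi in zip(w, means, variances)
    )
    return mean, second_moment - mean * mean


def fuse_product(means, variances, weights):
    """Weighted product of Gaussian experts: precisions add up."""
    precision = sum(wi / vi for wi, vi in zip(weights, variances))
    mean = sum(
        wi * mi / vi for wi, mi, vi in zip(weights, means, variances)
    ) / precision
    return mean, 1.0 / precision


if __name__ == "__main__":
    # Two experts that disagree about the next value.
    means, variances, weights = [0.0, 2.0], [1.0, 1.0], [0.5, 0.5]
    print("mixture:", fuse_mixture(means, variances, weights))
    print("product:", fuse_product(means, variances, weights))
```

In this sketch the mixture variance grows when the experts disagree, whereas the product variance depends only on the experts' variances and weights.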

Abstract

Online prediction of time series under regime switching is a widely studied problem in the literature, with many celebrated approaches. Using the non-parametric flexibility of Gaussian processes, the recently proposed INTEL algorithm provides a product of experts approach to online prediction of time series under possible regime switching, including the special case of outliers. This is achieved by adaptively combining several candidate models, each reporting its predictive distribution at time $t$. However, the INTEL algorithm uses a finite context window approximation to the predictive distribution, the computation of which scales cubically with the maximum lag, or otherwise scales quartically with exact predictive distributions. We introduce LINTEL, which uses the exact filtering distribution at time $t$ with constant-time updates, making the time complexity of the streaming algorithm optimal. We additionally note that the weighting mechanism of INTEL is better suited to a mixture of experts approach, and propose a fusion policy based on arithmetic averaging for LINTEL. We show experimentally that our proposed approach is over five times faster than INTEL under reasonable settings with better-quality predictions.
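
As a rough illustration of how a Markovian Gaussian process admits exact, constant-time streaming updates, the sketch below runs a scalar Kalman filter for a GP with the exponential (Matérn-1/2, Ornstein-Uhlenbeck) kernel $k(t, t') = \sigma^2 \exp(-|t - t'| / \ell)$. The choice of kernel, the zero mean function, the class name `OUKalmanFilter`, and all parameter names are assumptions made for illustration; they are not the configuration used in the paper.

```python
import math


class OUKalmanFilter:
    """Exact online GP regression for the exponential kernel.

    Hypothetical helper: the latent state is the GP value itself, so each
    new observation costs O(1) regardless of how much data came before.
    """

    def __init__(self, lengthscale, signal_var, noise_var):
        self.lengthscale = lengthscale
        self.signal_var = signal_var
        self.noise_var = noise_var
        # Start from the stationary prior N(0, signal_var).
        self.mean = 0.0
        self.var = signal_var

    def predict(self, dt):
        """Latent predictive mean and variance after a time gap dt."""
        a = math.exp(-dt / self.lengthscale)  # state transition
        q = self.signal_var * (1.0 - a * a)   # process noise variance
        return a * self.mean, a * a * self.var + q

    def step(self, y, dt):
        """Return the predictive N(mean, var) for y, then condition on y."""
        m_pred, p_pred = self.predict(dt)
        s = p_pred + self.noise_var           # predictive variance of y
        gain = p_pred / s                     # Kalman gain
        self.mean = m_pred + gain * (y - m_pred)
        self.var = (1.0 - gain) * p_pred
        return m_pred, s


if __name__ == "__main__":
    kf = OUKalmanFilter(lengthscale=1.0, signal_var=1.0, noise_var=0.1)
    for y in [1.0, 0.5, 0.7]:
        mean, var = kf.step(y, dt=1.0)
        print(f"predicted {mean:.3f} +/- {2 * math.sqrt(var):.3f}, observed {y}")
```

Each call to `step` touches only the current mean and variance, which is what makes the per-observation cost constant; a kernel-matrix formulation would instead require solving a linear system that grows with the amount of retained data.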

Paper Structure (20 sections, 17 equations, 4 figures, 1 table, 2 algorithms)

Introduction
Gaussian Process Regression
Gaussian Process Regression with Kernels
Drawbacks of the Kernel Approach
The Intel Algorithm
Weighting and Fusing
Outlier and Change Point Detection
Initialization and Approximations
Linear-Time Gaussian Process Regression
Markovian Gaussian Processes
Inference With Kalman Filtering
The Lintel algorithm
The Lintel Algorithm
Updating the Mean Function
Related Work
...and 5 more sections

Figures (4)

Figure 1: Data used in the synthetic data with outliers experiment, with each color representing a different random seed.
Figure 2: An example of the output for the synthetic data with outliers experiment. Top: The outputs of INTEL and LINTEL, with reported outliers marked. Shaded regions denote two standard deviations. Bottom: The difference in predictive mean $m_n$ and the data point $y_n$.
Figure 3: An example of the output for the synthetic data with outliers and regime switching experiment. Top: The outputs of INTEL and LINTEL, with reported outliers marked. Shaded regions denote two standard deviations. Bottom: The difference in predictive mean $m_n$ and the data point $y_n$. The legend is the same as in Figure 2 and is therefore omitted.
Figure 4: Output of INTEL and LINTEL on the CPU utilization dataset. Top: The outputs of INTEL and LINTEL, with reported outliers marked. Shaded regions denote two standard deviations. Bottom: The difference in predictive mean $m_n$ and the data point $y_n$. The legend is the same as in Figure 2 and is therefore omitted.
