Addressing Concept Shift in Online Time Series Forecasting: Detect-then-Adapt

YiFan Zhang; Weiqi Chen; Zhaoyang Zhu; Dalin Qin; Liang Sun; Xue Wang; Qingsong Wen; Zhang Zhang; Liang Wang; Rong Jin

TL;DR

The paper proposes Concept Drift Detection anD Adaptation (D3A), which first detects concept drift and then aggressively adapts the current model to the drifted concept for rapid adaptation. It also proposes a data augmentation strategy that adds Gaussian noise to existing training instances, which helps mitigate the data distribution gap.

Abstract

Online updating of time series forecasting models aims to tackle the challenge of concept drift by adjusting forecasting models based on streaming data. While numerous algorithms have been developed, most of them focus on model design and updating. In practice, many of these methods struggle with continuous performance regression in the face of concept drifts that accumulate over time. To address this limitation, we present a novel approach, Concept Drift Detection anD Adaptation (D3A), that first detects concept drift and then aggressively adapts the current model to the drifted concept for rapid adaptation. To best harness the utility of historical data for model adaptation, we propose a data augmentation strategy that introduces Gaussian noise into existing training instances. It helps mitigate the data distribution gap, a critical factor contributing to train-test performance inconsistency. The significance of our data augmentation process is verified by our theoretical analysis. Our empirical studies across six datasets demonstrate the effectiveness of D3A in improving model adaptation capability. Notably, compared to a simple Temporal Convolutional Network (TCN) baseline, D3A reduces the average Mean Squared Error (MSE) by $43.9\%$. For the state-of-the-art (SOTA) model, the MSE is reduced by $33.3\%$.
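The detect-then-adapt idea and the Gaussian-noise augmentation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the abstract does not specify the detector or the noise scale, so the z-score test on a sliding loss window, the function names `detect_drift` and `augment_with_gaussian_noise`, and the defaults `window=100`, `threshold=3.0`, `sigma=0.1`, and `copies=4` are all hypothetical choices.

```python
import numpy as np


def detect_drift(losses, window=100, threshold=3.0):
    """Return True when recent losses are much higher than historical ones.

    Assumption: drift is flagged with a simple z-score on the mean of the
    last `window` losses versus all earlier losses. The paper's actual
    detector may differ.
    """
    losses = np.asarray(losses, dtype=float)
    if losses.size < 2 * window:
        return False  # not enough history to compare against
    history, recent = losses[:-window], losses[-window:]
    # Standard error of a window-sized mean, estimated from the history.
    std_err = history.std(ddof=1) / np.sqrt(window) + 1e-12
    z = (recent.mean() - history.mean()) / std_err
    return bool(z > threshold)


def augment_with_gaussian_noise(x, sigma=0.1, copies=4, rng=None):
    """Return `copies` noisy versions of every instance in `x`, stacked.

    Assumption: zero-mean isotropic Gaussian noise with a hand-picked
    scale `sigma`; the first axis of `x` indexes training instances.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    noise = rng.normal(0.0, sigma, size=(copies,) + x.shape)
    # Broadcast x over the copies axis, then fold copies into the batch axis.
    return (x[None, ...] + noise).reshape((-1,) + x.shape[1:])
```

In an online loop, one would append each step's forecasting loss to a buffer, call `detect_drift` on it, and, when it fires, fine-tune the model more aggressively on recent instances expanded with `augment_with_gaussian_noise`.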

Paper Structure (16 sections, 3 theorems, 31 equations, 7 figures, 2 tables, 1 algorithm)

Introduction
Preliminary and Related Work
Concept Drift Detection and Adaptation
Concept drift detection
Better Adaptation by Reduced Distribution Gap
An Analysis based on Linear Model
How to Fill out the Gap?
Importance of D$^3$A
Experiments
Experimental setting
Baseline details
Online forecasting results
Online forecasting results with delayed feedback
Ablation studies and analysis
Conclusion and Limitations
...and 1 more sections

Key Result

Theorem 1

If $\|\Delta(\Sigma_A)\|_2 \leq 1/2$, we have $\mathrm{E}_{x\sim\mathcal{P}}\left[\left|(w_* - w_A)^{\top}x\right|^2\right] \leq 4\mathcal{L}_0\|\Delta(\Sigma_A)\|_2^2$, where $\mathcal{L}_0 = z_A^{\top}\Sigma^{-1}z_A$.
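
As a quick reading of the bound, using only the constants stated in the theorem, the right-hand side scales quadratically with $\|\Delta(\Sigma_A)\|_2$. The value $0.1$ below is an illustrative choice, not a number from the paper.

$$\|\Delta(\Sigma_A)\|_2 = 0.1 \;\Rightarrow\; 4\mathcal{L}_0\,(0.1)^2 = 0.04\,\mathcal{L}_0, \qquad \|\Delta(\Sigma_A)\|_2 = \tfrac{1}{2} \;\Rightarrow\; 4\mathcal{L}_0\left(\tfrac{1}{2}\right)^2 = \mathcal{L}_0.$$

At the largest admissible value of $1/2$ the bound equals $\mathcal{L}_0$, and halving $\|\Delta(\Sigma_A)\|_2$ cuts the bound by a factor of four.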

Figures (7)

Figure 1: Model performance comparison on various tasks.
Figure 2: Accumulated MAE curves of different online learning (OL) methods on the ECL dataset; the default model for (b) is FSNet.
Figure 3: (a) Accumulated MAE curves of ETTh2 and (b) Distribution of Loss Over Time, where $l_w=100$.
Figure 4: Comparison of MSE and Inference Time among the naive method, augmented with D$^3$A, and enhanced D$^3$A$^*$
Figure 5: Visualizing the model's prediction on the ECL dataset.
...and 2 more figures

Theorems & Definitions (6)

Theorem 1
Proposition 1
Theorem 2
proof
proof
proof
