Addressing Concept Shift in Online Time Series Forecasting: Detect-then-Adapt

YiFan Zhang, Weiqi Chen, Zhaoyang Zhu, Dalin Qin, Liang Sun, Xue Wang, Qingsong Wen, Zhang Zhang, Liang Wang, Rong Jin

TL;DR

D3A (Concept Drift Detection anD Adaptation) is a novel approach that first detects concept drift and then aggressively adapts the current model to the drifted concept for rapid adaptation. It is paired with a data augmentation strategy that adds Gaussian noise to existing training instances, which helps mitigate the data distribution gap.

Abstract

Online updating of time series forecasting models aims to tackle the challenge of concept drift by adjusting forecasting models based on streaming data. While numerous algorithms have been developed, most of them focus on model design and updating. In practice, many of these methods suffer from continual performance degradation as concept drifts accumulate over time. To address this limitation, we present a novel approach, Concept Drift Detection anD Adaptation (D3A), that first detects concept drift and then aggressively adapts the current model to the drifted concept for rapid adaptation. To best harness the utility of historical data for model adaptation, we propose a data augmentation strategy that introduces Gaussian noise into existing training instances. This helps mitigate the data distribution gap, a critical factor contributing to train-test performance inconsistency. The significance of our data augmentation process is verified by our theoretical analysis. Our empirical studies across six datasets demonstrate the effectiveness of D3A in improving model adaptation capability. Notably, compared to a simple Temporal Convolutional Network (TCN) baseline, D3A reduces the average Mean Squared Error (MSE) by $43.9\%$. For the state-of-the-art (SOTA) model, the MSE is reduced by $33.3\%$.
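The detect-then-adapt idea can be illustrated with a minimal sketch. The abstract does not specify the detection test, so the one-sided z-test on a sliding loss window below is an assumption, as are the names `LossDriftDetector` and `augment_with_gaussian_noise`, the default `window=100` (chosen only to echo the loss window $l_w=100$ in the Figure 3 caption), and the `threshold` and `sigma` values. It is not the authors' implementation.

```python
import numpy as np


class LossDriftDetector:
    """One-sided z-test on a sliding window of forecasting losses.

    Illustrative only: the detection rule is an assumption, not the paper's.
    """

    def __init__(self, window=100, threshold=5.0):
        self.window = window        # number of most recent losses to test
        self.threshold = threshold  # z-score above which drift is flagged
        self.history = []           # losses seen since the last detected drift

    def update(self, loss):
        """Record one loss; return True if drift is flagged at this step."""
        self.history.append(float(loss))
        if len(self.history) < 2 * self.window:
            return False  # not enough data to compare recent vs. past losses
        past = np.asarray(self.history[:-self.window])
        recent = np.asarray(self.history[-self.window:])
        # Standard error of a window-sized mean under the past loss spread.
        std_err = past.std() / np.sqrt(self.window) + 1e-12
        z = (recent.mean() - past.mean()) / std_err
        if z > self.threshold:
            self.history = []  # start a fresh loss history for the new concept
            return True
        return False


def augment_with_gaussian_noise(x, sigma=0.1, rng=None):
    """Return a copy of the training instances x with additive N(0, sigma^2) noise."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    return x + rng.normal(0.0, sigma, size=x.shape)
```

In an online loop, one would call `update` with each new forecasting loss and, when it returns `True`, switch to a more aggressive update on recent data, for example training on `augment_with_gaussian_noise(batch)` for several steps. How aggressive that update should be is left open here.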

Paper Structure

This paper contains 16 sections, 3 theorems, 31 equations, 7 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

If $\|\Delta(\Sigma_A)\|_2 \leq 1/2$, we have $\mathrm{E}_{x\sim\mathcal{P}}\left[\left|(w_* - w_A)^{\top}x\right|^2\right] \leq 4\mathcal{L}_0\|\Delta(\Sigma_A)\|_2^2$, where $\mathcal{L}_0 = z_A^{\top}\Sigma^{-1}z_A$.
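To get a feel for the scale of this bound, here is a plug-in example with hypothetical values: the norm value $0.1$ and $\mathcal{L}_0 = 2$ are chosen purely for illustration and do not come from the paper.

```latex
% Hypothetical values: \|\Delta(\Sigma_A)\|_2 = 0.1 (satisfies the condition \leq 1/2), \mathcal{L}_0 = 2
\mathrm{E}_{x\sim\mathcal{P}}\left[\left|(w_* - w_A)^{\top}x\right|^2\right]
  \leq 4\,\mathcal{L}_0\,\|\Delta(\Sigma_A)\|_2^2
  = 4 \times 2 \times (0.1)^2
  = 0.08
```

Because the bound is quadratic in the norm term, halving that term to $0.05$ would shrink the right-hand side by a factor of four, to $0.02$.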

Figures (7)

  • Figure 1: Model performance comparison on various tasks.
  • Figure 2: Accumulated MAE curves of different online learning (OL) methods on the ECL dataset; the default model for (b) is FSNet.
  • Figure 3: (a) Accumulated MAE curves on ETTh2 and (b) distribution of loss over time, where $l_w=100$.
  • Figure 4: Comparison of MSE and inference time among the naive method, the method augmented with D$^3$A, and the enhanced D$^3$A$^*$.
  • Figure 5: Visualizing the model's prediction on the ECL dataset.
  • ...and 2 more figures

Theorems & Definitions (6)

  • Theorem 1
  • Proposition 1
  • Theorem 2
  • proof
  • proof
  • proof