Robust Time Series Forecasting with Non-Heavy-Tailed Gaussian Loss-Weighted Sampler

Jiang You, Arben Cela, René Natowicz, Jacob Ouanounou, Patrick Siarry

TL;DR

A Gaussian loss-weighted sampler that multiplies each sample's running loss by a Gaussian distribution weight. It relieves the inefficiency of learning redundant easy samples and the overfitting to outliers, and improves training efficiency by preferentially learning samples whose loss is close to the average.

Abstract

Forecasting multivariate time series is a computationally intensive task, made harder by extreme or redundant samples. Recent resampling methods aim to increase training efficiency by reweighting samples based on their running losses. However, these methods do not address the problems caused by heavy-tailed loss distributions, such as overfitting to outliers. To tackle these issues, we introduce a novel approach: a Gaussian loss-weighted sampler that multiplies each sample's running loss by a Gaussian distribution weight. It reduces the probability of selecting samples with very low or very high losses while favoring those with losses close to the average. Because the resulting weighted loss distribution is theoretically not heavy-tailed, the method has two advantages over existing methods: 1) it relieves the inefficiency of learning redundant easy samples and the overfitting to outliers, and 2) it improves training efficiency by preferentially learning samples close to the average loss. Applications to real-world time series forecasting datasets demonstrate improvements in prediction quality of 1%-4%, measured by mean squared error, in channel-independent settings. The code will be available online after the review.
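
The sampling rule described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `gaussian_loss_weights` and `sample_indices` are hypothetical, and the z-scoring of the running losses before applying the Gaussian weight is an assumption made here so that $\mu = 0$ corresponds to the average loss. The heavy-tailed losses are simulated with a log-normal distribution purely for illustration.

```python
import numpy as np


def gaussian_loss_weights(running_losses, mu=0.0, sigma=1.0, eps=1e-12):
    """Sampling probabilities proportional to loss * Gaussian weight.

    Assumption (not confirmed by the source): losses are z-scored first,
    so that mu = 0 is centered on the average running loss.
    """
    losses = np.asarray(running_losses, dtype=np.float64)
    z = (losses - losses.mean()) / (losses.std() + eps)
    # Unnormalized Gaussian weight: small for very low or very high losses.
    gauss = np.exp(-0.5 * ((z - mu) / sigma) ** 2)
    scores = losses * gauss
    total = scores.sum()
    if total <= 0.0:
        # Degenerate case (e.g. all losses are zero): fall back to uniform.
        return np.full(losses.shape, 1.0 / losses.size)
    return scores / total


def sample_indices(running_losses, batch_size, rng, mu=0.0, sigma=1.0):
    """Draw sample indices (with replacement) using the weighted probabilities."""
    probs = gaussian_loss_weights(running_losses, mu=mu, sigma=sigma)
    return rng.choice(probs.size, size=batch_size, replace=True, p=probs)


# Simulated heavy-tailed running losses (illustrative data only).
rng = np.random.default_rng(0)
losses = rng.lognormal(mean=0.0, sigma=1.0, size=1000)
probs = gaussian_loss_weights(losses)
batch = sample_indices(losses, batch_size=32, rng=rng)
```

Under these assumptions, both the largest-loss sample (a likely outlier) and the smallest-loss sample (a likely redundant easy sample) receive a lower selection probability than a sample whose loss is near the mean.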

Paper Structure

This paper contains 12 sections, 2 theorems, 39 equations, 4 figures, 3 tables, 1 algorithm.

Key Result

Proposition 1

Given a loss $x$ whose density function $f(x)$ is Lipschitz continuous, there is a constant $C$ such that the loss under Gaussian loss-weighted resampling follows a distribution whose density function is $g(x)f(x)$ and

Figures (4)

  • Figure 1: (a) Synthetic dataset for the binary classification task. It contains 1000 samples from 2 classes, each following a Gaussian distribution with standard deviation $\sigma=0.3$ and a different center. (b) The surface is the contour of the loss (the minimum distance between the predicted value and 0 or 1), and the dot color represents the per-point loss. (c) The surface is the Gaussian weight ($\mu = 0.0$, $\sigma = 1.0$) computed from the contour loss. It focuses on samples around the decision boundary, but reduces how often samples on the boundary (hard) or far away from the boundary (easy) are learned. It is therefore less affected by outliers and redundant samples.
  • Figure 2: Example of the heavy-tailed true loss and the reweighted loss when applying the NLinear method to the ETTm2 dataset. The reweighted loss is more uniform.
  • Figure 3: Running test MSE of the Gaussian method, InfoBatch with pruning ratio $\sigma=0.3$, and the baseline without resampling.
  • Figure 4: Grid search over $\mu$ and $\sigma$ on the ETT dataset for the forecasting task with lookback window $L=720$ and prediction horizon $T=720$.

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Proposition 1
  • Theorem 1