ReFine: Boosting Time Series Prediction of Extreme Events by Reweighting and Fine-tuning

Jimeng Shi; Azam Shirali; Giri Narasimhan

TL;DR

Meta-learning-based reweighting outperforms existing heuristic reweighting methods, and fine-tuning further improves model performance. Both strategies can be applied to any type of neural network for time series forecasting.

Abstract

Extreme events are of great importance since they often represent high-impact occurrences. In climate and weather, for instance, extreme events include major storms, floods, and extreme heat or cold waves. However, they are often located at the tail of the data distribution. Consequently, accurately predicting these extreme events is challenging due to their rarity and irregularity. Prior studies have also referred to this as the out-of-distribution (OOD) problem, which occurs when the distribution of the test data is substantially different from that of the training data. In this work, we propose two strategies, reweighting and fine-tuning, to tackle the challenge. Reweighting forces machine learning models to focus on extreme events through a weighted loss function that assigns greater penalties to prediction errors on extreme samples than to errors on the remainder of the data. Unlike previous intuitive reweighting methods based on simple heuristics of the data distribution, we employ meta-learning to dynamically optimize these penalty weights. To further boost performance on extreme samples, we start from the reweighted models and fine-tune them using only the rare extreme samples. Through extensive experiments on multiple data sets, we empirically validate that our meta-learning-based reweighting outperforms existing heuristic approaches, and that the fine-tuning strategy further increases model performance. More importantly, these two strategies are model-agnostic and can be applied to any type of neural network for time series forecasting. The open-sourced code is available at https://github.com/JimengShi/ReFine.
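To make the idea of a weighted loss concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It splits samples at a percentile threshold and applies a fixed heuristic weight to the extreme ones. The function names, the 90th-percentile cutoff, the weight of 5.0, and the toy data are all assumptions chosen for illustration; the paper's own weights are learned by meta-learning rather than fixed.

```python
def percentile_threshold(values, q=95.0):
    """Return the q-th percentile using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    rank = (len(ordered) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def heuristic_weights(y_true, threshold, extreme_weight=5.0):
    """Fixed weights (an assumption): extreme_weight above the threshold, 1.0 otherwise."""
    return [extreme_weight if t > threshold else 1.0 for t in y_true]


def weighted_mse(y_true, y_pred, weights):
    """Weighted mean squared error, normalized by the total weight."""
    total = sum(weights)
    return sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred)) / total


# Hypothetical toy series: nine ordinary values and one extreme value
# that the predictions underestimate.
y_true = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1, 1.0, 9.0]
y_pred = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 4.0]

threshold = percentile_threshold(y_true, q=90.0)
weights = heuristic_weights(y_true, threshold)
plain = weighted_mse(y_true, y_pred, [1.0] * len(y_true))
reweighted = weighted_mse(y_true, y_pred, weights)
print(f"threshold={threshold:.2f} plain={plain:.3f} reweighted={reweighted:.3f}")
```

In this toy setting the reweighted loss is larger than the plain one because the error on the single extreme sample dominates, which is the pressure that pushes a model to fit the tail.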

Paper Structure (24 sections, 1 theorem, 19 equations, 8 figures, 4 tables, 1 algorithm)

Introduction
Related work
Time Series Prediction
Reweighting
Fine-tuning
Problem Formulation
Methodology
Reweighting
Inverse Proportional Function
Extreme Value Theory
Meta Learning
Theoretical convergence analysis of meta-learning reweighting
Fine-tuning
Experiments
Datasets
...and 9 more sections

Key Result

Lemma 1

Suppose the evaluation loss function is Lipschitz-smooth with constant $L$, and the training loss function $\ell_i$ of training sample $x_i$ has $\sigma$-bounded gradients. Let the learning rate $\phi$ satisfy $\phi \leq \frac{2n}{L \sigma^2}$, where $n$ is the training batch size. Following our algorithm, … where $\mathcal{L}(\theta)$ is the total evaluation loss. The equality $\mathcal{L}(\theta_{t+1}) =$ …
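As a worked illustration of the step-size condition $\phi \leq \frac{2n}{L \sigma^2}$ in Lemma 1, the sketch below evaluates the bound for one set of constants. The function name and the numeric values of $n$, $L$, and $\sigma$ are assumptions chosen for the arithmetic; they are not taken from the paper's experiments.

```python
def max_learning_rate(batch_size, lipschitz_const, sigma):
    """Upper bound 2n / (L * sigma^2) on the learning rate in the lemma's condition."""
    if batch_size <= 0 or lipschitz_const <= 0 or sigma <= 0:
        raise ValueError("all arguments must be positive")
    return 2.0 * batch_size / (lipschitz_const * sigma ** 2)


# Hypothetical constants: n = 32, L = 4, sigma = 2.
bound = max_learning_rate(batch_size=32, lipschitz_const=4.0, sigma=2.0)
print(bound)
```

The bound grows linearly with the batch size $n$ and shrinks quadratically with the gradient bound $\sigma$, so larger gradients call for a smaller learning rate.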

Figures (8)

Figure 1: Out-of-Distribution (OOD) problem showing the disparity between the training and test sets. The gray dashed line represents the threshold (95$^{th}$ percentile) to separate normal and extreme samples.
Figure 2: Illustration of data distribution. $\Phi$ and $\Phi'$ are the probability distribution functions of the training and evaluation sets, respectively. The dashed line marks the threshold that splits extreme and normal samples. The oval sizes represent the set sizes.
Figure 3: Training process of the unweighted framework, the reweighting approach, and the fine-tuning method. The ovals represent the sample spaces; the small circles represent individual inputs, while their sizes denote their weights; $\textbf{y}_i$ and $\hat{\textbf{y}}_i$ are the ground truth and prediction values, respectively. Trainable models are marked with the "fire" symbol in the upper left corner; individual layers are marked as trainable or frozen during fine-tuning.
Figure 4: A schematic of the meta-learning-based reweighting method.
Figure 5: Visualization of truth and prediction. "Unweighted" is the baseline model without reweighting and fine-tuning, while the last three are with reweighting and fine-tuning.
...and 3 more figures
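
The meta-learning-based reweighting shown schematically in Figure 4 is not spelled out in this summary. The sketch below shows one common formulation of meta-learned sample weights, in which each training sample is weighted by how well its gradient aligns with the gradient of the evaluation loss. Treating this as the paper's algorithm is an assumption. The one-parameter linear model, the function names, and the toy data are hypothetical.

```python
def per_sample_grads(theta, xs, ys):
    """d/d(theta) of 0.5 * (theta * x - y)**2 for each sample."""
    return [(theta * x - y) * x for x, y in zip(xs, ys)]


def mean_loss(theta, xs, ys):
    """Mean of 0.5 * squared error over a data set."""
    return sum(0.5 * (theta * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def meta_weights(theta, train_x, train_y, eval_x, eval_y):
    """Weight each training sample by the alignment of its gradient with the
    mean evaluation gradient; negative alignments are clipped to zero."""
    g_train = per_sample_grads(theta, train_x, train_y)
    g_eval = per_sample_grads(theta, eval_x, eval_y)
    g_eval_mean = sum(g_eval) / len(g_eval)
    raw = [max(g * g_eval_mean, 0.0) for g in g_train]
    total = sum(raw)
    # If no sample aligns with the evaluation gradient, all weights stay zero.
    return [r / total for r in raw] if total > 0 else raw


def reweighted_step(theta, train_x, train_y, eval_x, eval_y, lr):
    """One gradient step on the training loss using the meta-learned weights."""
    weights = meta_weights(theta, train_x, train_y, eval_x, eval_y)
    grads = per_sample_grads(theta, train_x, train_y)
    step = sum(w * g for w, g in zip(weights, grads))
    return theta - lr * step, weights


# Hypothetical toy data: two ordinary samples, one extreme sample (x = 4),
# and one sample whose target conflicts with the evaluation set.
train_x = [1.0, 1.0, 4.0, 1.0]
train_y = [2.0, 2.1, 8.0, -3.0]
eval_x = [1.0, 3.0]
eval_y = [2.0, 6.0]

theta0 = 0.0
theta1, weights = reweighted_step(theta0, train_x, train_y, eval_x, eval_y, lr=0.05)
loss_before = mean_loss(theta0, eval_x, eval_y)
loss_after = mean_loss(theta1, eval_x, eval_y)
print(weights, loss_before, loss_after)
```

In this toy run the extreme training sample receives the largest weight, the conflicting sample is clipped to zero, and the evaluation loss drops after one step.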

Theorems & Definitions (6)

Definition 3.1: Time Series Prediction
Definition 3.2: Extreme Events
Definition 3.3: Long-tailed Distributions
Definition 4.1: $\sigma$-bounded gradients [garrigos2023handbook]
Lemma 1
proof
