Harmonic Dataset Distillation for Time Series Forecasting

Seungha Hong, Sanghwan Jang, Wonbin Kweon, Suyeon Kim, Gyuseok Lee, Hwanjo Yu

TL;DR

Harmonic Dataset Distillation for Time Series Forecasting (HDT) decomposes a time series into its sinusoidal basis via the FFT and aligns its core periodic structure through Harmonic Matching. Experiments show strong cross-architecture generalization and scalability, supporting its practicality for large-scale, real-world applications.

Abstract

Time series forecasting (TSF) in the modern era faces significant computational and storage costs due to the massive scale of real-world data. Dataset Distillation (DD), a paradigm that synthesizes a small, compact dataset to achieve training performance comparable to that of the original dataset, has emerged as a promising solution. However, conventional DD methods are not tailored to time series and suffer from architectural overfitting and limited scalability. To address these issues, we propose Harmonic Dataset Distillation for Time Series Forecasting (HDT). HDT decomposes the time series into its sinusoidal basis through the FFT and aligns the core periodic structure by Harmonic Matching. Since this process operates in the frequency domain, all updates during distillation are applied globally without disrupting the temporal dependencies of the time series. Extensive experiments demonstrate that HDT achieves strong cross-architecture generalization and scalability, validating its practicality for large-scale, real-world applications.
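The frequency-domain idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the top-k-by-magnitude selection rule, the squared magnitude gap used as a loss, and the helper names `select_harmonics`, `keep_harmonics`, and `harmonic_matching_loss` are all assumptions made for illustration, and the gradient-matching component shown in Figure 2 is omitted.

```python
import numpy as np

def select_harmonics(x, k):
    """Indices of the k largest-magnitude non-DC bins of the real FFT (assumed rule)."""
    magnitude = np.abs(np.fft.rfft(x))
    magnitude[0] = 0.0  # ignore the DC (mean) component
    return np.sort(np.argsort(magnitude)[-k:])

def keep_harmonics(x, idx):
    """Rebuild a series from the selected harmonics only."""
    spectrum = np.fft.rfft(x)
    kept = np.zeros_like(spectrum)
    kept[idx] = spectrum[idx]
    return np.fft.irfft(kept, n=len(x))

def harmonic_matching_loss(x_seg, s, idx):
    """Squared gap between DFT magnitudes at the selected harmonics (hypothetical loss)."""
    gap = np.abs(np.fft.rfft(x_seg))[idx] - np.abs(np.fft.rfft(s))[idx]
    return float(np.sum(gap ** 2))

# Toy segment with two harmonics: 3 and 10 cycles over M points.
M = 96
t = np.arange(M)
x = 2.0 * np.sin(2 * np.pi * 3 * t / M) + 0.5 * np.cos(2 * np.pi * 10 * t / M)

idx = select_harmonics(x, k=2)
recon = keep_harmonics(x, idx)

# A circularly shifted copy has the same magnitudes, so this loss ignores the shift.
loss_shifted = harmonic_matching_loss(x, np.roll(x, 5), idx)
# An all-zero synthetic series has none of the periodic structure.
loss_zero = harmonic_matching_loss(x, np.zeros(M), idx)
```

In this toy case the two selected bins carry the whole signal, so `recon` reproduces `x`, and the magnitude-based loss is numerically zero for the shifted copy while it is large for the all-zero series.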

Paper Structure

This paper contains 45 sections, 1 theorem, 29 equations, 7 figures, 9 tables, 1 algorithm.

Key Result

Theorem 1

Let $\mathcal{F_X}$ and $\mathcal{F_S}$ be the DFTs of $M$-point subsets (segments) of $\mathcal{X}$ and $\mathcal{S}$, respectively, and let $r_\mathcal{X}(k)$ and $r_\mathcal{S}(k)$ denote their respective ACFs at lag $k$. Suppose we choose $\mathcal{S}$ so as to minimize $\|\,|\mathcal{F_X}| - |\mathcal{F_S}|\,\|$ ... where $\varepsilon$ is a measure of how closely $\mathcal{F_S}$ approximates $\mathcal{F_X}$ in the ...
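The link between DFT magnitudes and ACFs that the theorem relies on can be checked numerically in its limiting case, where the magnitudes match exactly. The sketch below uses the Wiener-Khinchin relation (the circular autocorrelation is the inverse DFT of the power spectrum). The excerpt does not show how the paper normalizes the ACF or whether it is circular, so the circular, non-mean-centered definition in `circular_acf` is an assumption, and the example does not reproduce the theorem's bound.

```python
import numpy as np

def circular_acf(x):
    """Circular autocorrelation via Wiener-Khinchin: inverse DFT of |DFT|^2, scaled by 1/M."""
    power = np.abs(np.fft.fft(x)) ** 2
    return np.fft.ifft(power).real / len(x)

rng = np.random.default_rng(0)
M = 64
x = rng.standard_normal(M)

# Build s with the same DFT magnitudes as x but randomized phases.
fx = np.fft.rfft(x)
phases = rng.uniform(-np.pi, np.pi, size=fx.shape)
phases[0] = 0.0    # the DC bin of a real signal must be real
phases[-1] = 0.0   # so must the Nyquist bin when M is even
s = np.fft.irfft(np.abs(fx) * np.exp(1j * phases), n=M)

# Direct definition for comparison: r(k) = (1/M) * sum_n x[n] * x[(n + k) mod M].
direct = np.array([np.dot(x, np.roll(x, -k)) for k in range(M)]) / M

same_acf = np.allclose(circular_acf(x), circular_acf(s))
same_series = np.allclose(x, s)
```

Here `x` and `s` differ sample by sample, yet their circular ACFs coincide at every lag because only the magnitudes enter the power spectrum.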

Figures (7)

  • Figure 1: Illustrative examples of distilling an original dataset $\mathcal{X}$ into a synthetic dataset $\mathcal{S}$ with (a) Window-based and (b) Harmonic Dataset Distillation for TSF. In (a), windows randomly sampled from $\mathcal{X}$ are distilled into arbitrary positions within $\mathcal{S}$. In (b), the sequence is decomposed into a sinusoidal basis, and selected harmonics ($\mathcal{H}_i$) between $\mathcal{X}$ and $\mathcal{S}$ are aligned during distillation.
  • Figure 2: An overview of our method. Selected harmonics (red box) of synthetic data are updated through Harmonic Matching and Gradient Matching.
  • Figure 3: Scalability and cross-architecture generalization performance on the ETTh1 and Traffic datasets. The left two plots show scalability results for varying synthetic data sizes ($M$). The right two plots compare the performance between fixed- and cross-architecture settings, highlighting that HDT maintains a significantly smaller increase in MSE compared to others.
  • Figure 4: Visual comparison of distilled datasets generated by different methods and backbones on ETTh1.
  • Figure 5: Visual comparison of distilled datasets generated by different methods and backbones on ETTm1.
  • ...and 2 more figures
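The contrast drawn in Figure 1 between window-based and harmonic updates can be made concrete with a small sketch. It only illustrates locality versus globality of a single update; the update sizes, the chosen bin, and the phase are arbitrary assumptions and not values from the paper.

```python
import numpy as np

M = 48
s = np.zeros(M)  # toy synthetic series before the update

# (a) Window-style update: only a contiguous slice of time steps changes.
s_window = s.copy()
s_window[10:18] += 1.0

# (b) Harmonic-style update: modify one frequency bin, then return to the time domain.
spec = np.fft.rfft(s)
spec[4] += (M / 2) * np.exp(1j * 0.3)  # unit-amplitude cosine, 4 cycles, phase 0.3 rad
s_harmonic = np.fft.irfft(spec, n=M)

# Count how many time steps each update touched.
changed_window = int(np.count_nonzero(np.abs(s_window - s) > 1e-12))
changed_harmonic = int(np.count_nonzero(np.abs(s_harmonic - s) > 1e-12))
```

The window update touches 8 of the 48 time steps, while the single-bin update changes all 48, which is the sense in which a frequency-domain update acts on the whole sequence at once.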

Theorems & Definitions (2)

  • Theorem 1
  • Proof of Theorem 1