DDTime: Dataset Distillation with Spectral Alignment and Information Bottleneck for Time-Series Forecasting
Yuqi Li, Kuiye Ding, Chuanguang Yang, Hao Wang, Haoxuan Wang, Huiran Duan, Junming Liu, Yingli Tian
TL;DR
DDTime tackles two core problems in time-series dataset distillation: autocorrelation-induced bias in value-term alignment and limited diversity among synthetic samples. It introduces a frequency-domain value term that decorrelates horizon components and an ISIB mechanism (an inter-sample regularizer inspired by the information bottleneck principle) that maximizes information density across synthetic trajectories, both packaged as a plug-in compatible with first-order condensation. Empirically, DDTime delivers about 30% relative accuracy gains across 20 benchmarks with about 2.49% computational overhead, and under favorable conditions the distilled subsets often outperform full-data training. The framework is architecture-agnostic and comes with practical guidelines for choosing the synthetic data size, the balance parameter alpha, and the diversity weight, making it a robust, scalable approach to data condensation for time-series forecasting (TSF).
Abstract
Time-series forecasting is fundamental across many domains, yet training accurate models often requires large-scale datasets and substantial computational resources. Dataset distillation offers a promising alternative by synthesizing compact datasets that preserve the learning behavior of the full data. However, extending dataset distillation to time-series forecasting is non-trivial due to two fundamental challenges: (1) temporal bias from strong autocorrelation, which distorts value-term alignment between teacher and student models; and (2) insufficient diversity among synthetic samples, arising from the absence of explicit categorical priors to regularize trajectory variety. In this work, we propose DDTime, a lightweight, plug-in distillation framework built upon first-order condensation decomposition. To tackle Challenge 1, DDTime revisits value-term alignment through temporal statistics and introduces a frequency-domain alignment mechanism that mitigates autocorrelation-induced bias, ensuring spectral consistency and temporal fidelity. To address Challenge 2, we design an inter-sample regularization inspired by the information bottleneck principle, which enhances diversity and maximizes information density across synthetic trajectories. The combined objective is theoretically compatible with a wide range of condensation paradigms and supports stable first-order optimization. Extensive experiments on 20 benchmark datasets and diverse forecasting architectures demonstrate that DDTime consistently outperforms existing distillation methods, achieving about 30% relative accuracy gains while introducing about 2.49% computational overhead. All code and distilled datasets will be released.
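To make the two ingredients above concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that frequency-domain alignment can be approximated by comparing real-FFT spectra of teacher and student forecasts along the horizon axis, and that inter-sample diversity can be approximated by penalizing pairwise cosine similarity between synthetic trajectories. The function names, the `alpha` mixing rule, and the default weights are all hypothetical; the abstract does not specify the exact formulation.

```python
import numpy as np


def spectral_alignment_loss(teacher_pred, student_pred):
    """Mean absolute gap between forecast spectra (assumed form, not the paper's).

    Both inputs have shape (num_samples, horizon). The real FFT is taken
    along the horizon axis, so each frequency bin is compared separately.
    """
    t_spec = np.fft.rfft(teacher_pred, axis=-1)
    s_spec = np.fft.rfft(student_pred, axis=-1)
    return float(np.mean(np.abs(t_spec - s_spec)))


def diversity_penalty(synthetic):
    """Mean squared pairwise cosine similarity between synthetic samples.

    A stand-in for an information-bottleneck-style regularizer (assumption):
    lower values mean the synthetic trajectories are less redundant.
    """
    n = synthetic.shape[0]
    if n < 2:
        return 0.0  # a single sample has no pairs to compare
    flat = synthetic.reshape(n, -1)
    flat = flat - flat.mean(axis=1, keepdims=True)  # center each sample
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)          # guard against zero norm
    sim = unit @ unit.T
    off_diag = sim[~np.eye(n, dtype=bool)]          # drop self-similarity
    return float(np.mean(off_diag ** 2))


def combined_objective(teacher_pred, student_pred, synthetic,
                       alpha=0.5, diversity_weight=0.1):
    """Hypothetical combination: alpha mixes frequency and time-domain terms."""
    time_loss = float(np.mean((teacher_pred - student_pred) ** 2))
    freq_loss = spectral_alignment_loss(teacher_pred, student_pred)
    return (alpha * freq_loss
            + (1.0 - alpha) * time_loss
            + diversity_weight * diversity_penalty(synthetic))
```

In this sketch, identical teacher and student forecasts give a zero alignment term, and a synthetic set made of copies of one trajectory gives the maximum diversity penalty of 1. A real distillation loop would need a differentiable version (for example in an autodiff framework) so that gradients can flow back to the synthetic samples.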
