ShapeCond: Fast Shapelet-Guided Dataset Condensation for Time Series Classification
Sijia Peng, Yun Xiong, Xi Chen, Yi Xie, Guanzhi Li, Yanwei Yu, Yangyong Zhu, Zhiqiang Shen
TL;DR
ShapeCond addresses the storage and compute cost of growing time series data with a shapelet-guided dataset condensation framework that preserves both local discriminative motifs and global temporal structure. It optimizes a compact synthesized set through a dual-view process: a frozen teacher encoder guides global dynamics, while shapelet transforms enforce local motif constraints. BatchNorm statistics are matched to those of the full data, and soft teacher labels provide supervision. The approach synthesizes data up to 29× faster than the prior state-of-the-art method CondTSC, outperforms all prior time-series condensation methods on seven datasets, and supports downstream tasks such as neural architecture search with far less data. The results indicate that explicitly preserving shapelet knowledge in condensed data yields substantial accuracy gains while reducing storage and computation, offering a scalable strategy for temporal data modeling in resource-constrained settings.
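The two supervision signals named above, BatchNorm statistic matching and soft teacher labels, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `(batch, channels, time)` feature layout, the squared-error form of the statistic loss, and the `temperature` parameter are all assumptions made for illustration.

```python
import numpy as np

def bn_stat_loss(syn_feats, real_mean, real_var):
    """Squared distance between per-channel statistics of synthetic features
    and stored statistics of the full dataset (assumed loss form).

    syn_feats: array of shape (batch, channels, time).
    real_mean, real_var: arrays of shape (channels,).
    """
    mean = syn_feats.mean(axis=(0, 2))  # per-channel mean over batch and time
    var = syn_feats.var(axis=(0, 2))    # per-channel (biased) variance
    return float(np.sum((mean - real_mean) ** 2) + np.sum((var - real_var) ** 2))

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_label_loss(student_logits, teacher_logits, temperature=1.0):
    """Cross-entropy of student predictions against teacher soft labels."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature) + 1e-12)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())
```

In a condensation loop, losses of this kind would be evaluated on the synthesized batch and minimized with respect to the synthetic series themselves, with the teacher kept frozen.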
Abstract
Time series data underpins many domains (e.g., finance and climate science), but its rapid growth strains storage and computation. Dataset condensation can alleviate this by synthesizing a compact training set that preserves key information. Yet most condensation methods are image-centric and often fail on time series because they miss the temporal structure specific to such data, especially local discriminative motifs such as shapelets. In this work, we propose ShapeCond, a novel and efficient condensation framework for time series classification that leverages shapelet-based dataset knowledge via a shapelet-guided optimization strategy. The cost of our shapelet-assisted synthesis is independent of sequence length, so longer series yield larger speedups in synthesis (e.g., 29$\times$ faster than CondTSC, the prior state-of-the-art method for time-series condensation, and up to 10,000$\times$ faster than naively using shapelets on the Sleep dataset with 3,000 timesteps). By explicitly preserving critical local patterns, ShapeCond improves downstream accuracy and consistently outperforms all prior state-of-the-art time series dataset condensation methods across extensive experiments. Code is available at https://github.com/lunaaa95/ShapeCond.
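For readers unfamiliar with shapelets, the standard shapelet transform maps a series to its minimum distance from each shapelet over all sliding windows. The sketch below shows that textbook definition in NumPy for univariate series; it is not taken from the ShapeCond code base, and the function names are hypothetical.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any window of the
    series that has the same length as the shapelet."""
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    dists = np.sqrt(((windows - shapelet) ** 2).sum(axis=1))
    return float(dists.min())

def shapelet_transform(dataset, shapelets):
    """Map each series to a vector of distances, one entry per shapelet."""
    return np.array([[shapelet_distance(s, sh) for sh in shapelets]
                     for s in dataset])

# Usage: a series containing the motif has distance 0 to the shapelet,
# while a flat series does not.
motif = np.array([1.0, 2.0, 1.0])
with_motif = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
flat = np.zeros(7)
features = shapelet_transform([with_motif, flat], [motif])
```

This direct scan visits every window, so its cost grows with the series length.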
