MSDformer: Multi-scale Discrete Transformer For Time Series Generation
Zhicheng Chen, Shibo Feng, Xi Xiao, Zhong Zhang, Qing Li, Xingyu Gao, Peilin Zhao
TL;DR
MSDformer tackles synthetic time series generation with a two-stage, discrete-token framework that captures multi-scale temporal patterns. It combines a multi-scale time series tokenizer built from cascaded VQ-VAEs with a multi-scale autoregressive Transformer that models the resulting token sequences, enabling coarse-to-fine generation in the discrete latent space. The authors ground the approach in rate-distortion theory, showing that Discrete Token Modeling (DTM) affords explicit control over distortion via codebook size, and that multi-scale modeling increases the effective rate and thereby reduces distortion. Empirically, MSDformer and its predecessor SDformer outperform GAN-, VAE-, and DDPM-based baselines across six datasets, with MSDformer delivering substantial gains in long-term generation and fidelity while maintaining reasonable inference efficiency. The work suggests that multi-scale DTM is a powerful paradigm for time series synthesis and points to future extensions in adaptive scaling and spatiotemporal generation.
Abstract
Discrete Token Modeling (DTM), which employs vector quantization techniques, has demonstrated remarkable success in modeling non-natural language modalities, particularly in time series generation. While our prior work SDformer established the first DTM-based framework to achieve state-of-the-art performance in this domain, two critical limitations persist in existing DTM approaches: 1) their inability to capture the multi-scale temporal patterns inherent to complex time series data, and 2) the absence of theoretical foundations to guide model optimization. To address these challenges, we propose a novel multi-scale DTM-based time series generation method, called Multi-Scale Discrete Transformer (MSDformer). MSDformer employs a multi-scale time series tokenizer to learn discrete token representations at multiple scales, which jointly characterize the complex nature of time series data. Subsequently, MSDformer applies a multi-scale autoregressive token modeling technique to capture the multi-scale patterns of time series within the discrete latent space. Theoretically, we validate the effectiveness of the DTM method and the rationality of MSDformer through the rate-distortion theorem. Comprehensive experiments demonstrate that MSDformer significantly outperforms state-of-the-art methods. Both the theoretical analysis and the experimental results indicate that incorporating multi-scale information and modeling multi-scale patterns can substantially enhance the quality of generated time series in DTM-based approaches. The code will be released upon acceptance.
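To make the tokenization idea concrete, the following is a minimal NumPy sketch of vector quantization applied at two temporal scales. It is an illustration under stated assumptions, not the authors' implementation: the codebooks are random rather than learned, average pooling stands in for the VQ-VAE encoders, the scales are quantized independently rather than cascaded, and the names `quantize`, `downsample`, `factors`, and `codebook_size` are hypothetical. The last line computes the maximum number of bits the token streams can carry (number of tokens times log2 of the codebook size), which is the quantity that codebook size and additional scales increase.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry."""
    # Squared Euclidean distance between every latent (T, D) and every code (K, D).
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    tokens = dists.argmin(axis=1)
    return tokens, codebook[tokens]


def downsample(series, factor):
    """Average-pool a (T, D) series along time by an integer factor."""
    num_steps, dim = series.shape
    usable = (num_steps // factor) * factor
    return series[:usable].reshape(-1, factor, dim).mean(axis=1)


rng = np.random.default_rng(0)
series = rng.normal(size=(24, 2))  # toy series: 24 time steps, 2 channels

factors = (4, 1)   # assumed scales: a coarse stream, then a fine stream
codebook_size = 8  # assumed codebook size K, shared across scales here
codebooks = [rng.normal(size=(codebook_size, 2)) for _ in factors]  # random, not learned

token_streams = []
for factor, codebook in zip(factors, codebooks):
    tokens, _ = quantize(downsample(series, factor), codebook)
    token_streams.append(tokens)

# Upper bound on the information in the tokens: each token carries at most log2(K) bits.
max_bits = sum(len(tokens) * np.log2(codebook_size) for tokens in token_streams)
```

In this toy setup the coarse stream has 6 tokens and the fine stream has 24, so the bound is 30 tokens at 3 bits each. Enlarging the codebook or adding a scale raises that bound, which is the informal intuition behind the rate-distortion argument summarized above.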
