Less Is More: Generating Time Series with LLaMA-Style Autoregression in Simple Factorized Latent Spaces

Siyuan Li, Yifan Sun, Lei Cheng, Lewen Wang, Yang Liu, Weiqing Liu, Jianlong Li, Jiang Bian, Shikai Fang

TL;DR

FAR-TS targets fast and flexible generation of multivariate time series by disentangling spatial and temporal structure into a learnable basis and discrete temporal tokens. A VQ-based Stage I learns a factorized latent space, and a LLaMA-style autoregressive Transformer in Stage II models the token sequence to generate variable-length series, enabling conditional generation and forecasting. Empirical results across multiple datasets show FAR-TS outperforms diffusion-based and autoregressive baselines while offering significantly faster sampling and interpretable latent representations. This approach advances practical time series synthesis for data augmentation, simulation, and privacy-preserving tasks with scalable, controllable outputs.

Abstract

Generative models for multivariate time series are essential for data augmentation, simulation, and privacy preservation, yet current state-of-the-art diffusion-based approaches are slow and limited to fixed-length windows. We propose FAR-TS, a simple yet effective framework that combines disentangled factorization with an autoregressive Transformer over a discrete, quantized latent space to generate time series. Each time series is decomposed into a data-adaptive basis that captures static cross-channel correlations and temporal coefficients that are vector-quantized into discrete tokens. A LLaMA-style autoregressive Transformer then models these token sequences, enabling fast and controllable generation of sequences with arbitrary length. Owing to its streamlined design, FAR-TS achieves orders-of-magnitude faster generation than Diffusion-TS while preserving cross-channel correlations and an interpretable latent space, enabling high-quality and flexible time series synthesis.
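The decomposition described in the abstract can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the authors' implementation: the names `quantize` and `reconstruct`, the sizes `T`, `D`, `R`, `K`, and the random basis and codebook are all hypothetical, and a least-squares projection stands in for the learned encoder.

```python
import numpy as np

def quantize(V, codebook):
    """Map each row of V (T, R) to the index of its nearest codebook row (K, R)."""
    # Pairwise squared Euclidean distances, shape (T, K).
    d2 = ((V[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def reconstruct(tokens, codebook, U):
    """Look up quantized coefficients and project them through the basis U (D, R)."""
    V_hat = codebook[tokens]   # (T, R) temporal coefficients
    return V_hat @ U.T         # (T, D) reconstructed series

rng = np.random.default_rng(0)
T, D, R, K = 24, 7, 3, 16              # hypothetical sizes
U = rng.normal(size=(D, R))            # stand-in for the learned cross-channel basis
codebook = rng.normal(size=(K, R))     # stand-in for the learned shared codebook

# Toy series whose coefficients are exactly codebook entries.
true_tokens = rng.integers(0, K, size=T)
X = codebook[true_tokens] @ U.T        # (T, D)

# Assumption: least-squares projection onto U replaces the learned encoder.
V = X @ np.linalg.pinv(U).T            # (T, R)
tokens = quantize(V, codebook)
X_tilde = reconstruct(tokens, codebook, U)
```

Because the toy series is built from codebook entries, quantization recovers the original tokens and the reconstruction matches `X`. On real data the quantization step is lossy, and both the basis and the codebook would be learned rather than sampled at random.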

Paper Structure

This paper contains 38 sections, 7 equations, 9 figures, 13 tables.

Figures (9)

  • Figure 1: Inference time versus Discriminative Score (left) and Context-FID (right) on the ETTh dataset. For both metrics, lower values indicate better performance, so models closer to the lower-left corner perform best. Bubble size denotes model size, and dashed lines mark the results of the corresponding models. The proposed FAR-TS achieves on average a 50% performance gain with orders-of-magnitude faster generation and shows better scalability than Diffusion-TS.
  • Figure 2: Overview of the FAR-TS pipeline with two training stages and inference. Components marked with a lock are frozen during that stage. Stage 1: A pointwise MLP encoder maps each time step of $X \in \mathbb{R}^{T \times D}$ to a coefficient vector, and the coefficient vectors are quantized into tokens $z$ via a shared codebook. Stage 2: A LLaMA-style autoregressive Transformer models the token sequence. Inference: sampled tokens $\hat{z}$ are mapped to coefficients $\hat{V}$, combined with a learnable spatial basis $U \in \mathbb{R}^{D \times R}$ to reconstruct $\tilde{X}$, and refined by a residual decoder.
  • Figure 3: Visualizations of the time series generated by TimeVQVAE and FAR-TS.
  • Figure 4: (a) Runtime of Diffusion-TS and FAR-TS with different model sizes. (b) Samples from the fMRI dataset, along with the learned basis functions and temporal coefficients of FAR-TS.
  • Figure 5: Visualizations of the time series synthesized by FAR-TS, TimeVQVAE, and Diffusion-TS at different sequence lengths on the ETTh dataset.
  • ...and 4 more figures
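
The inference path in the Figure 2 caption (sample tokens one at a time, look up coefficients, multiply by the basis) can be sketched as follows. This is a minimal illustration under loud assumptions: a random bigram table (`bigram`) stands in for the LLaMA-style Transformer, the residual decoder is omitted, and every name and size here is hypothetical rather than taken from the paper.

```python
import numpy as np

def sample_tokens(next_token_probs, length, rng, prefix=(0,)):
    """Autoregressively extend `prefix` until `length` tokens exist."""
    tokens = list(prefix)
    while len(tokens) < length:
        p = next_token_probs(tokens)                 # distribution over K tokens
        tokens.append(int(rng.choice(len(p), p=p)))  # sample the next token
    return np.array(tokens)

def decode(tokens, codebook, U):
    """Tokens -> coefficients (T, R) -> series (T, D); residual decoder omitted."""
    return codebook[tokens] @ U.T

rng = np.random.default_rng(0)
D, R, K = 7, 3, 16                     # hypothetical sizes
U = rng.normal(size=(D, R))            # stand-in for the learned spatial basis
codebook = rng.normal(size=(K, R))     # stand-in for the learned codebook

# Assumption: a random bigram table replaces the autoregressive Transformer.
logits = rng.normal(size=(K, K))
P = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

def bigram(prefix_tokens):
    return P[prefix_tokens[-1]]

# The same model yields series of different lengths.
short = decode(sample_tokens(bigram, 24, rng), codebook, U)
long = decode(sample_tokens(bigram, 96, rng), codebook, U)

# Continuing from an observed token prefix gives a forecasting-style rollout.
forecast_tokens = sample_tokens(bigram, 10, rng, prefix=(1, 2, 3))
```

Since generation proceeds token by token, the output length is a sampling-time choice rather than a fixed window. Seeding the loop with tokens from an observed history is one natural way to condition generation; whether FAR-TS conditions in exactly this way is not stated in the text above.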