Sundial: A Family of Highly Capable Time Series Foundation Models

Yong Liu, Guo Qin, Zhiyuan Shi, Zhi Chen, Caiyin Yang, Xiangdong Huang, Jianmin Wang, Mingsheng Long

TL;DR

Sundial addresses the non-determinism of time series by learning a flexible generative model that conditions on history to sample from $p(x_{t+1:t+f}|\boldsymbol{h}_t)$. It uses TimeFlow Loss within a flow-matching framework to train a decoder-only Transformer on continuous-valued sequences without discrete tokenization, enabling multiple plausible futures with fast test-time generation. The approach is bolstered by patch-based tokenization, RoPE-enhanced attention, and a trillion-point TimeBench pre-training corpus, delivering state-of-the-art zero-shot performance on both point and probabilistic benchmarks. Together, these contributions unlock scalable, reliable, and efficient generative forecasting for real-world decision-making across domains such as weather, energy, and finance.

Abstract

We introduce Sundial, a family of native, flexible, and scalable time series foundation models. To predict the next-patch's distribution, we propose a TimeFlow Loss based on flow-matching, which facilitates native pre-training of Transformers on continuous-valued time series without discrete tokenization. Conditioned on arbitrary-length time series, our models are pre-trained without specifying any prior distribution and can generate multiple probable predictions, achieving more flexibility in representation learning than using parametric densities. Towards time series foundation models, we leverage minimal but crucial adaptations of Transformers and curate TimeBench with one trillion time points, comprising mostly real-world datasets and synthetic data. By mitigating mode collapse via TimeFlow Loss, we pre-train a family of Sundial models on TimeBench, which achieve unprecedented model capacity and generalization performance. In addition to excellent scalability, Sundial achieves state-of-the-art results on both point and probabilistic forecasting benchmarks with a just-in-time inference speed, i.e., making zero-shot predictions within a few milliseconds. We believe that Sundial's pioneering generative forecasting capability can improve model reliability in real-world decision-making. Code is available at: https://github.com/thuml/Sundial.
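To make the flow-matching idea in the abstract concrete, the following is a minimal sketch of a generic conditional flow-matching objective with a straight-line path between noise and data. It is not the paper's TimeFlow Loss implementation: the function names, the `predict_velocity(x_t, t, condition)` callable, and the Euler sampler are all assumptions introduced here for illustration, and the exact parameterization used by Sundial may differ.

```python
import random


def flow_matching_example(target_patch, rng):
    """Build one training example for a straight-line flow-matching path.

    Draws Gaussian noise e and a time t in [0, 1), then returns the point
    x_t = t * y + (1 - t) * e together with the target velocity y - e.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in target_patch]
    t = rng.random()
    x_t = [t * y + (1.0 - t) * e for y, e in zip(target_patch, noise)]
    velocity = [y - e for y, e in zip(target_patch, noise)]
    return x_t, t, velocity


def flow_matching_loss(predict_velocity, target_patch, condition, rng):
    """Mean squared error between predicted and target velocity.

    `predict_velocity` is a hypothetical stand-in for a small network that
    takes (x_t, t, condition); `condition` plays the role of the history
    representation h_t produced by the Transformer.
    """
    x_t, t, velocity = flow_matching_example(target_patch, rng)
    pred = predict_velocity(x_t, t, condition)
    return sum((p - v) ** 2 for p, v in zip(pred, velocity)) / len(velocity)


def sample_patch(predict_velocity, condition, patch_len, rng, steps=8):
    """Generate one patch by Euler-integrating the velocity field from noise.

    Calling this repeatedly with fresh noise gives multiple plausible futures.
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(patch_len)]
    dt = 1.0 / steps
    for k in range(steps):
        v = predict_velocity(x, k * dt, condition)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

No prior distribution over the forecast is specified here beyond the Gaussian noise source, which is what lets a sampler of this kind produce arbitrary predictive distributions rather than a fixed parametric family.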

Paper Structure

This paper contains 47 sections, 12 equations, 15 figures, 9 tables, 1 algorithm.

Figures (15)

  • Figure 1: A native time series model operates on the original series of continuous values. A flexible foundation model is pre-trained without specifying prior distributions. Sundial is the first family of native and flexible time series foundation models.
  • Figure 2: Overall architecture of Sundial. The input time series is divided into patch tokens, which are embedded from the original continuous values. The patch embeddings are fed into a decoder-only Transformer, a stabilized and accelerated variant that learns token representations via causal self-attention. The model is optimized using our TimeFlow Loss, a parameterized loss function that models the per-token probability distribution conditioned on the learned representations and generates multiple plausible predictions under the flow-matching framework.
  • Figure 3: Ratios of data sources in TimeBench, the pre-training corpus of Sundial. Detailed statistics are provided in Table \ref{tab:dataset_summary}.
  • Figure 4: Model evaluation on the FEV leaderboard, which includes $27$ datasets not seen by Sundial. Baseline models are categorized into statistical methods fitted on each time series, task-specific deep models trained on each dataset, and pre-trained foundation models. Pre-trained models that have seen several datasets during pre-training are denoted as Pre-trained Models (Other). A lower MASE/WQL indicates a better result. Sundial makes probabilistic predictions using $20$ generated series, consistent with Chronos (Ansari et al., 2024).
  • Figure 5: Inference time evaluation following Chronos (Ansari et al., 2024), averaged over the FEV leaderboard. The computing resources used by each model are marked. The x-axis is logarithmic.
  • ...and 10 more figures
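
The Figure 4 caption reports MASE and WQL, with probabilistic predictions formed from 20 generated series. As a rough illustration of how such metrics can be computed from sample paths, here is a sketch under assumed definitions: WQL as twice the summed pinball loss divided by the summed absolute actuals, averaged over quantile levels 0.1 to 0.9, and MASE scaled by the in-sample seasonal-naive error. The function names and the quantile interpolation rule are choices made here, not taken from the paper or the FEV codebase, whose conventions may differ.

```python
def empirical_quantile(values, q):
    """Quantile of a list via linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    frac = pos - lo
    return s[lo] * (1.0 - frac) + s[hi] * frac


def pinball(y, pred, q):
    """Pinball (quantile) loss of a single prediction at level q."""
    diff = y - pred
    return q * diff if diff >= 0 else (q - 1.0) * diff


def weighted_quantile_loss(samples, actual,
                           levels=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    """WQL from sample paths; `samples` is a list of paths of len(actual)."""
    scale = sum(abs(y) for y in actual)
    total = 0.0
    for q in levels:
        loss_q = 0.0
        for t, y in enumerate(actual):
            pred = empirical_quantile([path[t] for path in samples], q)
            loss_q += pinball(y, pred, q)
        total += 2.0 * loss_q / scale
    return total / len(levels)


def mase(point_forecast, actual, history, season=1):
    """Forecast MAE divided by the in-sample seasonal-naive MAE."""
    naive = [abs(history[i] - history[i - season])
             for i in range(season, len(history))]
    scale = sum(naive) / len(naive)
    mae = sum(abs(a - f) for a, f in zip(actual, point_forecast)) / len(actual)
    return mae / scale
```

With 20 sampled futures, `samples` would hold 20 lists of forecast values, and a point forecast for MASE could be taken as the per-step median of those samples.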