TimePFN: Effective Multivariate Time Series Forecasting with Synthetic Data
Ege Onur Taga, M. Emrullah Ildiz, Samet Oymak
TL;DR
This paper addresses few-shot multivariate time-series (MTS) forecasting under data scarcity by introducing TimePFN, a framework that combines synthetic data priors with a cross-channel transformer architecture. It comprises two key components: (1) LMC-Synth/KernelSynth, which generates large-scale, realistic synthetic MTS data with controllable inter- and intra-channel dependencies, and (2) the TimePFN architecture, which performs convolutional feature extraction, patch-based tokenization, and channel-mixing attention to capture temporal and cross-channel relationships. Trained on synthetic data, the model achieves strong zero-shot and few-shot performance across nine real datasets and generalizes robustly to univariate forecasting; ablations confirm the importance of both the synthetic priors and the architectural design. This work demonstrates the viability of synthetic priors for foundation-model-style MTS forecasting and suggests a pathway toward scalable, transferable MTS foundation models.
Abstract
The diversity of time-series applications and the scarcity of domain-specific data highlight the need for time-series models with strong few-shot learning capabilities. In this work, we propose a novel training scheme and a transformer-based architecture, collectively referred to as TimePFN, for multivariate time-series (MTS) forecasting. TimePFN is based on the concept of Prior-data Fitted Networks (PFNs), which aim to approximate Bayesian inference. Our approach consists of (1) generating synthetic MTS data through diverse Gaussian process kernels and the linear coregionalization method, and (2) a novel MTS architecture capable of utilizing both temporal and cross-channel dependencies across all input patches. We evaluate TimePFN on several benchmark datasets and demonstrate that it outperforms the existing state-of-the-art models for MTS forecasting in both zero-shot and few-shot settings. Notably, fine-tuning TimePFN with as few as 500 data points nearly matches the error of full-dataset training, and even 50 data points yield competitive results. We also find that TimePFN exhibits strong univariate forecasting performance, attesting to its generalization ability. Overall, this work unlocks the power of synthetic data priors for MTS forecasting and facilitates strong zero- and few-shot forecasting performance.
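
To make step (1) more concrete, the following is a minimal sketch of how synthetic MTS data could be drawn from Gaussian process kernels and mixed across channels in the spirit of linear coregionalization. It is an illustration under stated assumptions, not the authors' generator: the function names (`rbf_kernel`, `periodic_kernel`, `sample_gp`, `sample_lmc_series`), the small kernel bank, its hyperparameters, and the Dirichlet-distributed mixing weights are all hypothetical choices made for this example.

```python
import numpy as np


def rbf_kernel(t, length_scale):
    """Squared-exponential kernel matrix over the time points t."""
    diff = t[:, None] - t[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def periodic_kernel(t, period, length_scale=1.0):
    """Periodic (exp-sine-squared) kernel matrix over the time points t."""
    diff = np.abs(t[:, None] - t[None, :])
    return np.exp(-2.0 * np.sin(np.pi * diff / period) ** 2 / length_scale ** 2)


def sample_gp(K, rng):
    """Draw one zero-mean sample with covariance K.

    Uses an eigendecomposition and clips tiny negative eigenvalues,
    which is more forgiving than Cholesky for near-singular kernels.
    """
    eigvals, eigvecs = np.linalg.eigh(K)
    eigvals = np.clip(eigvals, 0.0, None)
    return eigvecs @ (np.sqrt(eigvals) * rng.standard_normal(len(eigvals)))


def sample_lmc_series(num_channels, length, num_latent=3, seed=0):
    """Mix independent latent GP draws into correlated channels.

    Assumption: each channel is a convex combination of the latent
    functions, with weights drawn from a flat Dirichlet distribution.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(length, dtype=float)

    # Hypothetical kernel bank: a smooth trend, a seasonal pattern, and
    # their product (products of valid kernels are valid kernels).
    kernel_bank = [
        rbf_kernel(t, 20.0),
        periodic_kernel(t, 24.0),
        rbf_kernel(t, 50.0) * periodic_kernel(t, 12.0),
    ]

    latents = np.stack(
        [sample_gp(kernel_bank[i % len(kernel_bank)], rng) for i in range(num_latent)]
    )  # shape: (num_latent, length)

    mixing = rng.dirichlet(np.ones(num_latent), size=num_channels)  # (num_channels, num_latent)
    series = mixing @ latents  # (num_channels, length)
    return series, mixing, latents


series, mixing, latents = sample_lmc_series(num_channels=5, length=96)
print(series.shape)  # (5, 96)
```

Because every channel is a weighted sum of the same latent functions, the mixing weights determine how strongly channels correlate with one another, while the kernel choice shapes the temporal structure within each channel. With a single latent function all channels coincide; increasing the number of latent functions gives the channels more room to differ.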
