TiCT: A Synthetically Pre-Trained Foundation Model for Time Series Classification
Chin-Chia Michael Yeh, Uday Singh Saini, Junpeng Wang, Xin Dai, Xiran Fan, Jiarui Sun, Yujie Fan, Yan Zheng
TL;DR
TiCT addresses the need for versatile, fine-tuning-free time-series classifiers by training a transformer-based model solely on synthetic data to perform in-context learning (ICL) for classification. It introduces a scalable bit-based label encoding and a dedicated output attention mechanism, paired with a Mixup-inspired synthetic pre-training framework that uses time-series distortions as augmentation to promote generalization. On the UCR Archive, TiCT achieves accuracy competitive with state-of-the-art supervised methods while requiring no weight updates. Its final logits obey $l[c] = \sum_{i \in \mathcal{I}_c} \alpha_i$, where $\mathcal{I}_c$ indexes the context samples of class $c$. This work demonstrates a practical path to general-purpose, in-context time-series classifiers and suggests potential extensions to other domains and modalities.
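The logit formula above can be read as a scatter-add of per-sample weights into class bins. The following is a minimal sketch, assuming each $\alpha_i$ is a non-negative weight that the output attention assigns to context sample $i$ for a given query; this is an interpretation, since the summary does not define $\alpha_i$. The function name `class_logits` and the toy values are hypothetical and not taken from the paper.

```python
def class_logits(alpha, context_labels, num_classes):
    """Sum per-context-sample weights into one logit per class.

    Computes l[c] = sum of alpha[i] over all context indices i whose
    label equals c. `alpha` is assumed (not confirmed by the source)
    to hold the output-attention weights for a single query.
    """
    if len(alpha) != len(context_labels):
        raise ValueError("alpha and context_labels must have equal length")
    logits = [0.0] * num_classes
    for weight, label in zip(alpha, context_labels):
        logits[label] += weight
    return logits


# Toy example with hypothetical values: five context samples, three classes.
alpha = [0.5, 0.25, 0.125, 0.0625, 0.0625]
labels = [0, 2, 0, 1, 2]
logits = class_logits(alpha, labels, num_classes=3)

# The predicted class is the one with the largest summed weight.
predicted = max(range(3), key=lambda c: logits[c])
```

Under this reading, the set of possible outputs is determined by the labels present in the context rather than by a fixed-size output layer, which is consistent with the stated goal of handling an arbitrary number of classes.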
Abstract
The ubiquity of time series data creates a strong demand for general-purpose foundation models, yet developing them for classification remains a significant challenge, largely due to the high cost of labeled data. Foundation models capable of in-context learning (ICL) offer a powerful solution, adapting to new tasks with minimal examples and reducing the need for extensive retraining. However, prior work on large-scale time series models has predominantly focused on forecasting, leaving a critical gap for versatile, fine-tuning-free classification. To address this, we introduce TiCT (Time-series in-Context Transformer), a transformer-based model pre-trained exclusively on synthetic data to perform in-context classification. We make two primary technical contributions: 1) a novel architecture featuring a scalable bit-based label encoding and a special output attention mechanism to handle an arbitrary number of classes; and 2) a synthetic pre-training framework that combines a Mixup-inspired process with data augmentation to foster generalization and noise invariance. Extensive evaluations on the UCR Archive show that TiCT achieves competitive performance against state-of-the-art supervised methods. Crucially, this is accomplished using only in-context examples at inference time, without updating a single model weight.
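To make the idea of a bit-based label encoding concrete, the sketch below shows one plausible scheme: writing the class index as a fixed-width binary vector, so that `num_bits` inputs can represent up to `2 ** num_bits` classes. The abstract does not specify the actual encoding used by TiCT, so the bit order, the width, and the helper names `encode_label` and `decode_label` are all assumptions made for illustration.

```python
def encode_label(class_index, num_bits):
    """Encode a class index as a list of bits, least significant bit first.

    Assumption: a plain binary code. The paper's encoding may differ.
    """
    if not 0 <= class_index < 2 ** num_bits:
        raise ValueError("class_index does not fit in num_bits bits")
    return [(class_index >> position) & 1 for position in range(num_bits)]


def decode_label(bits):
    """Invert encode_label: rebuild the class index from its bits."""
    return sum(bit << position for position, bit in enumerate(bits))


# Hypothetical usage: class 5 encoded with a width of 4 bits.
code = encode_label(5, num_bits=4)
recovered = decode_label(code)
```

The appeal of such a code is that the input width grows logarithmically with the number of classes, whereas a one-hot encoding grows linearly.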
