UniCL: A Universal Contrastive Learning Framework for Large Time Series Models
Jiawei Li, Jingshu Peng, Haoyang Li, Lei Chen
TL;DR
UniCL addresses the lack of labeled time-series data and limited cross-domain generalization by proposing a universal, scalable contrastive pretraining framework for time-series foundation models. It introduces a unified, trainable time-series augmentation that exploits spectral information to produce pattern-preserving, diverse, low-bias views, together with two objectives that encourage diversity while maintaining low-bias embeddings. To handle datasets with many channels, it defines a spectral-distance-based hardness metric and formulates hard-negative selection as top-k retrieval from a metric matrix, guided by LLM representations. The approach is validated on two benchmark datasets spanning eleven domains, and Time-LLaMA is pre-trained on large-scale data and open-sourced, enabling broader cross-domain applicability.
Abstract
Time-series analysis plays a pivotal role in critical applications ranging from finance to healthcare, and covers tasks such as forecasting and classification. To handle the inherent complexities of time-series data, such as high dimensionality and noise, traditional supervised learning methods require extensive labels to be annotated for each task, which is costly and impractical in real-world applications. In contrast, pre-trained foundation models offer a promising alternative: they leverage unlabeled data to capture general time-series patterns and can then be fine-tuned for specific tasks. However, existing approaches to pre-training such models typically suffer from high bias and low generality, because they rely on predefined, rigid augmentation operations and on domain-specific training data. To overcome these limitations, this paper introduces UniCL, a universal and scalable contrastive learning framework for pre-training time-series foundation models on cross-domain datasets. Specifically, we propose a unified and trainable time-series augmentation operation that leverages spectral information to generate pattern-preserving, diverse, and low-bias time-series data. In addition, we introduce a scalable augmentation algorithm capable of handling datasets with varying lengths, which facilitates cross-domain pre-training. Extensive experiments on two benchmark datasets spanning eleven domains validate the effectiveness of UniCL and demonstrate its strong generalization across fields.
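
To make the idea of a spectral augmentation concrete, the following is a minimal illustrative sketch, not the operator proposed in this paper. It assumes a single-channel series, a per-frequency weight vector standing in for the trainable parameters, and additive complex noise; the function name `spectral_augment` and its parameters are hypothetical.

```python
import numpy as np


def spectral_augment(x, weights, noise_scale=0.1, rng=None):
    """Create an augmented view of a 1-D series by perturbing its spectrum.

    x           : real-valued array of shape (T,)
    weights     : array of shape (T // 2 + 1,), one scale per frequency bin
                  (a stand-in for trainable parameters; an assumption)
    noise_scale : relative magnitude of the complex noise added per bin
    """
    rng = np.random.default_rng() if rng is None else rng
    spec = np.fft.rfft(x)  # one-sided spectrum, T // 2 + 1 complex bins
    # Complex Gaussian noise, scaled by the mean spectral magnitude
    noise = rng.standard_normal(spec.shape) + 1j * rng.standard_normal(spec.shape)
    spec_aug = spec * weights + noise_scale * np.abs(spec).mean() * noise
    # Back to the time domain, keeping the original length
    return np.fft.irfft(spec_aug, n=len(x))


# Usage: two different views of the same sinusoid
rng = np.random.default_rng(0)
t = np.arange(128)
x = np.sin(2 * np.pi * t / 16)
w = np.ones(len(x) // 2 + 1)
view_a = spectral_augment(x, w, noise_scale=0.05, rng=rng)
view_b = spectral_augment(x, w, noise_scale=0.05, rng=rng)
```

With unit weights and `noise_scale=0`, the function returns the input unchanged up to floating-point error, so the weights and noise level control how far a view departs from the original pattern. Because the transform adapts to `len(x)`, the same function can be applied to series of different lengths, provided `weights` is sized to match.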
