LENS: Large Pre-trained Transformer for Exploring Financial Time Series Regularities
Yuanjian Xu, Anxian Liu, Jianing Hao, Zhenzhuo Li, Shichang Meng, Guang Zhang
TL;DR
LENS addresses the challenge of modeling highly stochastic financial time series by introducing a domain-specific foundation model that combines an invertible patch-based embedding with a TimeFormer encoder-decoder and specialized time-aware and channel-aware attention. Training proceeds in two stages: a noise-robust invertible embedding is first learned with contrastive and reconstruction losses, and the TimeFormer is then pre-trained in a multi-task setting on varied input-output lengths. The pre-training corpus contains over 100 billion financial observations. Empirical results across long-term forecasting, imputation, and portfolio management demonstrate strong generalization and improved performance over a wide range of baselines, with ablations and embedding-space analyses shedding light on the contributions of each component. The work provides practical guidance for building large-scale, noise-robust financial time series models and establishes a foundation for future enhancements in handling abrupt market dynamics and regime changes.
Abstract
Modeling large-scale time series has gained significant attention in recent years. However, its direct application in finance remains challenging due to substantial differences in data characteristics across domains. Specifically, financial systems feature inherent stochasticity and low signal-to-noise ratios, rendering traditional methods and pre-training approaches ineffective. This underscores the urgent need for a foundation model tailored to financial time series. To bridge this gap, we propose LENS, a pre-trained model for this domain. LENS effectively captures the complexity of financial stochastic systems through a carefully crafted model architecture and mitigates noise during pre-training by using an invertible embedding module. We provide a rigorous theoretical explanation of the model's effectiveness and validate its performance through extensive experiments. Pre-trained on a dataset comprising 100 billion financial observations, LENS achieves exceptional results across a wide range of critical downstream tasks. Moreover, our work offers practical insights into developing pre-trained time series models in high-noise environments, paving the way for further advancements in this pivotal research domain.
