LENS: Large Pre-trained Transformer for Exploring Financial Time Series Regularities

Yuanjian Xu; Anxian Liu; Jianing Hao; Zhenzhuo Li; Shichang Meng; Guang Zhang

TL;DR

LENS addresses the challenge of modeling highly stochastic financial time series by introducing a domain-specific foundation model that combines an invertible patch-based embedding with a TimeFormer encoder-decoder and specialized time-aware and channel-aware attention. The model is trained in two stages—noise-robust invertible embedding with contrastive and reconstruction losses, followed by multi-task pretraining of the TimeFormer on varied input-output lengths—and pre-trained on over 100 billion financial observations. Empirical results across long-term forecasting, imputation, and portfolio management demonstrate strong generalization and improved performance over a wide range of baselines, with ablations and embedding-space analyses shedding light on the contributions of each component. The work provides practical guidance for building large-scale, noise-robust financial time series models and establishes a foundation for future enhancements in handling abrupt market dynamics and regime changes.
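The patch-based, invertible embedding mentioned above can be illustrated with a minimal sketch. It covers only the non-learned part of the idea: splitting a series into fixed-length patches and mapping the patches back without loss. The function names `patchify`, `unpatchify`, and `reconstruction_mse`, the patch length of 4, and the sample prices are assumptions for illustration and are not taken from the paper, whose embedding module is learned and trained with contrastive and reconstruction losses.

```python
from typing import List


def patchify(series: List[float], patch_len: int) -> List[List[float]]:
    """Split a 1-D series into non-overlapping patches of length patch_len."""
    if patch_len <= 0 or len(series) % patch_len != 0:
        raise ValueError("series length must be a positive multiple of patch_len")
    return [series[i:i + patch_len] for i in range(0, len(series), patch_len)]


def unpatchify(patches: List[List[float]]) -> List[float]:
    """Inverse of patchify: concatenate the patches back into one series."""
    return [value for patch in patches for value in patch]


def reconstruction_mse(original: List[float], reconstructed: List[float]) -> float:
    """Mean squared error between a series and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


# Hypothetical price series, used only to exercise the functions.
prices = [100.0, 101.5, 99.8, 100.2, 102.1, 101.7, 103.0, 102.4]
patches = patchify(prices, patch_len=4)
restored = unpatchify(patches)
error = reconstruction_mse(prices, restored)
```

Because the patching step is a pure reshaping, the round trip is exact and `error` is zero here. In a learned embedding, a reconstruction loss of this form would instead be minimized during training.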

Abstract

Modeling large-scale time series has gained significant attention in recent years. However, its direct application in finance remains challenging due to substantial differences in data characteristics across domains. Specifically, financial systems feature inherent stochasticity and low signal-to-noise ratios, rendering traditional methods and pre-training approaches ineffective. This underscores the urgent need for a foundation model tailored to financial time series. To bridge this gap, we propose LENS, a pre-trained model for this domain. LENS effectively captures the complexity of financial stochastic systems through a carefully crafted model architecture and mitigates noise during pre-training by using an invertible embedding module. We provide a rigorous theoretical explanation of the model's effectiveness and validate its performance through extensive experiments. Pre-trained on a dataset comprising 100 billion financial observations, LENS achieves exceptional results across a wide range of critical downstream tasks. Moreover, our work offers practical insights into developing pre-trained time series models in high-noise environments, paving the way for further advancements in this pivotal research domain.

Paper Structure (17 sections, 4 theorems, 12 equations, 3 figures, 7 tables)

This paper contains 17 sections, 4 theorems, 12 equations, 3 figures, 7 tables.

Introduction
Methods
Invertible Embedding Module
TimeFormer
Training Process
Experiments
Financial Data for Pretraining
Downstream Tasks and Baseline
Long-term Forecasting.
Imputation.
Portfolio Management.
Result Analysis
Ablation Study.
Exploration of Embedding Space.
Scaling Experiments.
...and 2 more sections

Key Result

Proposition 2.1

Contrastive learning can reduce the impact of noise on the embedding space. Let $f_\theta(\cdot)$ denote the representation function parameterized by $\theta$, and let $x_i$ and $x_i^+$ be noisy samples derived from clean signals $x_i^*$ and $x_i^{+*}$. For Gaussian white noise, the expected bound o… where $\mathbb{E}_{\eta}$ represents the expectation over the Gaussian noise distribution, $C$ is t…

Figures (3)

Figure 1: The overall architecture of LENS. Taking the forecasting task as an example, a 3-variate time series is visualized. The shaded patches represent the forecast horizon, whose corresponding embedding is fed into LENS, an encoder-decoder structure. (A) represents the invertible contrastive learning module, while (B) illustrates the attention mechanism within the TimeFormer model, comprising time-aware attention and channel-aware attention.
Figure 2: Schematic illustration of three financial time series analysis tasks.
Figure 3: Forecasting Performance Comparison. This figure illustrates the forecasting results of a sample where the task involves predicting the next 196 time steps based on the previous 96 steps. The green line represents the true values, providing a reference for evaluating model performance.
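The forecasting setup described in the Figure 3 caption (a context of 96 steps followed by a target of 196 steps) can be sketched as a sliding-window split. This is an illustrative assumption about how such samples might be built, not the paper's data pipeline; the function name `make_windows`, the `stride` parameter, and the synthetic series are hypothetical.

```python
from typing import List, Tuple


def make_windows(
    series: List[float],
    lookback: int = 96,
    horizon: int = 196,
    stride: int = 1,
) -> List[Tuple[List[float], List[float]]]:
    """Slide over the series and return (context, target) pairs.

    Each context holds `lookback` consecutive steps and each target holds
    the `horizon` steps that immediately follow it.
    """
    windows = []
    last_start = len(series) - lookback - horizon
    for start in range(0, last_start + 1, stride):
        context = series[start:start + lookback]
        target = series[start + lookback:start + lookback + horizon]
        windows.append((context, target))
    return windows


# Synthetic series of 300 steps, used only to exercise the function.
series = [float(t) for t in range(300)]
windows = make_windows(series)
context, target = windows[0]
```

With 300 steps and a combined window of 292 steps, the sketch yields 9 samples at stride 1. A larger stride reduces the overlap between consecutive samples.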

Theorems & Definitions (5)

Proposition 2.1
Definition 2.1: Noisy Data
Theorem 2.1: Generalization Error Bound via Rademacher Complexity with Noise
Lemma 2.1: Rademacher Complexity of Attention Mechanisms with Noisy Data
Proposition 2.2: Comparison of Generalization Error Bounds with Noise
