Transformer for Times Series: an Application to the S&P500

Pierre Brugiere; Gabriel Turinici

TL;DR

This work applies a Transformer encoder to financial time series by embedding univariate observations and training with a classification objective over discretized future values. On a synthetic Ornstein–Uhlenbeck (OU) process, the encoder learns the conditional distribution of the next value and reaches a bucket accuracy of about 32%, close to the best achievable. On real S&P 500 data, it extracts volatility-related signals and predicts next-day quadratic variation better than naive baselines, with gains that are modest but promising; direct prediction of the next return remains difficult in the baseline setup. The results suggest that Transformer-based time-series models have practical potential in finance, particularly for volatility and risk assessment, and that refining the embedding and tuning the architecture are natural ways to improve predictive power.
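To make the classification objective over discretized future values concrete, here is a minimal sketch of how a single series could be turned into (window, bucket label) pairs. The quantile-based bucket edges, the window length of 32, the choice of 5 buckets, and the helper names `quantile_edges` and `make_windows` are illustrative assumptions; this summary does not state the paper's actual bucketing scheme.

```python
import random
from bisect import bisect_right


def quantile_edges(values, n_buckets):
    """Interior bucket edges chosen so each bucket holds roughly the same number of values.

    Assumption: equal-frequency buckets; the paper may discretize differently.
    """
    ordered = sorted(values)
    return [ordered[(k * len(ordered)) // n_buckets] for k in range(1, n_buckets)]


def make_windows(series, window, edges):
    """Pair each window of past values with the bucket index of the value that follows it."""
    inputs, labels = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        # bisect_right returns how many edges are <= the next value, i.e. its bucket index.
        labels.append(bisect_right(edges, series[start + window]))
    return inputs, labels


# Toy Gaussian series standing in for real observations (hypothetical data).
rng = random.Random(0)
toy_series = [rng.gauss(0.0, 1.0) for _ in range(500)]

edges = quantile_edges(toy_series, n_buckets=5)
X, y = make_windows(toy_series, window=32, edges=edges)
```

Each entry of `X` is a window that an encoder could embed, and the matching entry of `y` is the class index that a cross-entropy loss would target.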

Abstract

Transformer models have been used extensively, and with good results, in a wide range of machine learning applications, including Large Language Models and image generation. Here, we investigate whether this approach can be applied to financial time series. We first describe how the dataset is constructed in two prototypical situations: a synthetic mean-reverting Ornstein-Uhlenbeck process on the one hand, and real S&P500 data on the other. We then present the proposed Transformer architecture in detail, and finally we discuss some encouraging results. For the synthetic data we predict the next move rather accurately, and for the S&P500 we obtain some interesting results on quadratic variation and volatility prediction.
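As an illustration of the synthetic setting, the following is a minimal sketch of simulating a mean-reverting Ornstein-Uhlenbeck path. It uses the standard exact discretization of the OU process, which is an assumption here: the paper's own update rule is not reproduced in this summary. The default parameters match the values quoted in the caption of Figure 2 ($dt=1$, $\theta=1$, $\sigma=1$, $\mu=0$); the function name `simulate_ou`, the starting value, and the seed are hypothetical.

```python
import math
import random


def simulate_ou(n_steps, dt=1.0, theta=1.0, sigma=1.0, mu=0.0, h0=0.0, seed=0):
    """Simulate an OU path with n_steps + 1 values using the exact one-step transition.

    Over a step dt, the process decays toward mu by a factor exp(-theta * dt)
    and receives Gaussian noise with variance sigma^2 * (1 - exp(-2*theta*dt)) / (2*theta).
    """
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    noise_std = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    path = [h0]
    for _ in range(n_steps):
        path.append(mu + (path[-1] - mu) * decay + noise_std * rng.gauss(0.0, 1.0))
    return path


# 301 values, the same count as the hidden values h_0, ..., h_300 shown in Figure 2.
path = simulate_ou(300)
```

With these parameters the stationary standard deviation is $\sigma/\sqrt{2\theta} \approx 0.71$, so a long simulated path should fluctuate around $\mu$ on that scale.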

Paper Structure (21 sections, 13 equations, 9 figures, 3 tables)

Objectives and general introduction
Methodology
First notations and time series embedding
Creating a dataset from a single time series
Ornstein Uhlenbeck process for the $y_i$
Positional encoding
The Transformer model for classification
Neural network model details
Analysis of the different layers
The Multi Head Transformer-Encoder
Performance of the model
Crossentropy
Categorical accuracy
Pointwise analysis of the predictions
Parameters of the model
...and 6 more sections

Figures (9)

Figure 1: Example of variables used in the simulations.
Figure 2: Trajectory of the process \ref{eq:ouprocess} for the parameters $dt=1$, $\theta=1$, $\sigma=1$, $\mu=0$. Left: first 301 hidden values $h_0, \ldots, h_{300}$. Right: first 300 values $y_1, \ldots, y_{300}$.
Figure 3: Structure of the program.
Figure 4: Predictions for the synthetic stochastic process \ref{eq:ouprocess} with 24131 observations, 30 epochs.
Figure 5: Predictions for the synthetic stochastic process \ref{eq:ouprocess} with 241310 observations, 40 epochs.
...and 4 more figures

Theorems & Definitions (1)

proof
