Transformer Based Time-Series Forecasting for Stock

Shuozhe Li, Zachery B Schulwol, Risto Miikkulainen

TL;DR

This work introduces Stockformer, a Transformer-based, multivariate time-series forecasting model for hourly stock-price prediction. By leveraging attention mechanisms and causal relationships among securities, it aims to outperform traditional methods and LSTMs while providing trading-oriented loss functions and ROI-based evaluation. The approach emphasizes data-quality-aware preprocessing, architecture choices between full and sparse attention, and training strategies to stabilize optimization. Early experiments show promising profit potential and practical relevance, though the authors acknowledge the need for broader tickers, dynamic retraining, and enhanced temporal encoding to realize robust, real-world gains.

Abstract

To the naked eye, stock prices appear chaotic, dynamic, and unpredictable. Indeed, predicting them is one of the most difficult forecasting tasks, one that hundreds of millions of retail and professional traders around the world attempt every second, even before the market opens. With recent advances in machine learning and the amount of data the market has generated over the years, applying machine learning techniques such as deep neural networks has become unavoidable. In this work, we model the task as a multivariate forecasting problem rather than a naive autoregression problem. The multivariate analysis relies on the attention mechanism, applied through "Stockformer", a modified version of the Transformer that we created.
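
To make the distinction between multivariate forecasting and naive autoregression concrete, the following is a minimal sketch of how the two framings differ at the data level. It is not the authors' preprocessing pipeline: the helper names `pct_change` and `make_windows`, the toy prices, the three-ticker layout, and the lookback of 3 are all assumptions chosen for illustration.

```python
import numpy as np

def pct_change(prices):
    """Convert a (T, n_tickers) price array to (T-1, n_tickers) percent changes."""
    prices = np.asarray(prices, dtype=float)
    return (prices[1:] - prices[:-1]) / prices[:-1]

def make_windows(features, target_col, lookback):
    """Slice a (T, n_features) array into supervised (X, y) pairs.

    X[i] holds `lookback` consecutive rows of every feature column;
    y[i] is the target column's value at the step right after that window.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0] - lookback
    if n <= 0:
        raise ValueError("series must be longer than the lookback")
    X = np.stack([features[i:i + lookback] for i in range(n)])
    y = features[lookback:, target_col]
    return X, y

# Toy hourly prices for three hypothetical tickers (one column each).
prices = np.array([
    [100.0, 50.0, 20.0],
    [101.0, 50.5, 20.2],
    [102.0, 50.0, 20.4],
    [101.0, 49.5, 20.0],
    [103.0, 50.5, 20.6],
    [104.0, 51.0, 20.8],
])
returns = pct_change(prices)  # shape (5, 3)

# Multivariate framing: every ticker's history is an input feature.
X_multi, y_multi = make_windows(returns, target_col=0, lookback=3)

# Naive autoregressive framing: only the target's own history is used.
X_uni, y_uni = make_windows(returns[:, [0]], target_col=0, lookback=3)

print(X_multi.shape, X_uni.shape)
```

Both framings predict the same target series; the multivariate version simply gives the model the other tickers' histories as additional input channels, which is what lets an attention mechanism relate them to one another.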

Paper Structure

This paper contains 39 sections, 7 equations, 13 figures, 2 tables.

Figures (13)

  • Figure 1: Finding dependencies between two positions in an RNN requires traversing the recurrent units.
  • Figure 2: The inputs are price information, given either as percent changes or as raw price values; the output can be price information or just the trend.
  • Figure 3: The stock prices of four American multinational oil and gas corporations over the past 5 years.
  • Figure 4: The sequence length stays at n, but more sequence patterns are extracted in each output channel.
  • Figure 5: Encoder layers stacked together.
  • ...and 8 more figures