Integration of LSTM Networks in Random Forest Algorithms for Stock Market Trading Predictions

Juan C. King, Jose M. Amigo

TL;DR

The paper investigates whether fundamental, technical, or hybrid models can reliably predict stock-price direction over a 10-weekday horizon. It implements three model families: fundamental models based on Random Forest, Gradient Boosting, and Neural Networks; technical models that train an asset-specific LSTM on price indicators; and a hybrid model that feeds the technical models' outputs into a fundamental model as additional inputs. The fundamental Random Forest performs solidly; the technical LSTMs have limited predictive power on average, with considerable variation across assets; and the hybrid yields only modest gains unless it is restricted to high-quality LSTM inputs, in which case performance can rise substantially. A simulated value-investing strategy illustrates the practical potential of the approach, though computational cost and data accessibility remain key constraints for real-world deployment.

Abstract

The aim of this paper is the analysis and selection of stock trading systems that combine different models with data of a different nature, such as financial and microeconomic information. Specifically, based on previous work by the authors and applying advanced techniques of Machine Learning and Deep Learning, our objective is to formulate trading algorithms for the stock market with empirically tested statistical advantages, thus improving on results published in the literature. Our approach integrates Long Short-Term Memory (LSTM) networks with algorithms based on decision trees, such as Random Forest and Gradient Boosting. While the former analyze price patterns of financial assets, the latter are fed with economic data of companies. Numerical simulations of algorithmic trading with data from international companies and 10-weekday predictions confirm that an approach based on both fundamental and technical variables can outperform the usual approaches, which do not combine those two types of variables. Among the decision-tree algorithms, Random Forest turned out to be the best performer. We also discuss how the prediction performance of such a hybrid approach can be boosted by selecting the technical variables.
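
The data flow of the hybrid approach can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that the direction label is 1 when the closing price 10 trading days ahead is higher than the current close, and that the technical model's output enters the fundamental feature set as one extra column named lstm_prediction (the name that appears in the Figure 5 caption). The helper names direction_labels and hybrid_row, the fundamental feature names, and all numbers are hypothetical.

```python
from typing import Dict, List, Optional

HORIZON = 10  # trading days ahead, matching the 10-weekday predictions in the abstract


def direction_labels(closes: List[float], horizon: int = HORIZON) -> List[Optional[int]]:
    """Label each day 1 if the close `horizon` days later is higher, else 0.

    Assumed labeling rule. The last `horizon` days have no future price
    and are labeled None so they can be dropped before training.
    """
    labels: List[Optional[int]] = []
    for t in range(len(closes)):
        if t + horizon < len(closes):
            labels.append(1 if closes[t + horizon] > closes[t] else 0)
        else:
            labels.append(None)
    return labels


def hybrid_row(fundamentals: Dict[str, float], lstm_prediction: float) -> Dict[str, float]:
    """Append the technical model's output to a copy of the fundamental features."""
    row = dict(fundamentals)
    row["lstm_prediction"] = lstm_prediction
    return row


# Toy, steadily rising price series (hypothetical data).
closes = [100.0 + 0.5 * t for t in range(12)]
labels = direction_labels(closes)

# Hypothetical fundamental features plus a hypothetical LSTM output for one asset and date.
row = hybrid_row({"pe_ratio": 14.2, "debt_to_equity": 0.8}, lstm_prediction=0.63)
```

Rows built this way would then be passed to a tree-based classifier such as Random Forest, so the technical signal competes with the fundamental variables on equal terms.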

Paper Structure

This paper contains 19 sections, 4 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Train and Test ROC curves of the fundamental best model (Random Forest).
  • Figure 2: Train and Test ROC curves in cross validation (Random Forest).
  • Figure 3: Scheme of the hybrid model.
  • Figure 4: Train and Test ROC curves for the hybrid model.
  • Figure 5: Top 20 feature importance ranking for the hybrid model. The variable "lstm_prediction" is included at position 15 with an importance of 0.025.
  • ...and 3 more figures
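
Several of the figures listed above compare train and test ROC curves. As a hedged illustration of what such a curve summarizes, the sketch below computes ROC points and the area under the curve from binary labels and predicted scores in plain Python. The function names roc_points and auc and the toy data are hypothetical, and the paper's own evaluation code is not shown in this summary.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """Return ROC points obtained by lowering the decision threshold step by step."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # Sort samples from highest to lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a tied score group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy labels (1 = price went up) and predicted scores (hypothetical data).
y_example = [1, 0, 1, 1, 0, 0]
s_example = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_auc = auc(roc_points(y_example, s_example))
```

Computing this quantity separately on the training and test sets gives a rough check for overfitting: a large gap between the two areas suggests the model has memorized the training data.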