Mamba Outpaces Reformer in Stock Prediction with Sentiments from Top Ten LLMs

Lokesh Antony Kadiyala, Amir Mirzaeinia

TL;DR

This work tackles intraday stock price forecasting by fusing semantic sentiment signals from ten large language models with 1‑minute Apple price data. It compares two efficient sequence models, Mamba (state-space) and Reformer (LSH Transformer), across ten LLM sources using a carefully constructed minute-level dataset and Optuna-tuned hyperparameters. Results show that Mamba consistently achieves superior short-term accuracy, with the best pairing being Mamba + LLaMA 3.3–70B (MSE ≈ 0.137), while Reformer attains its best at MSE ≈ 2.647 with Qwen Turbo; overall, Mamba outperforms Reformer in most LLM settings. The study demonstrates that LLM-derived sentiment can meaningfully enhance real-time financial forecasting when combined with architectures optimized for dense temporal data, suggesting practical potential for sentiment-guided intraday trading signals.

Abstract

The stock market is extremely difficult to predict in the short term because of high volatility, news-driven price changes, and the non-linear nature of financial time series. This research proposes a framework for improving minute-level prediction accuracy by combining semantic sentiment scores from ten leading large language models (LLMs) with minute-interval intraday stock price data. We systematically constructed a time-aligned dataset of AAPL news articles and 1-minute Apple Inc. (AAPL) stock prices covering April 4 to May 2, 2025. Sentiment analysis was performed with the DeepSeek-V3, GPT variants, LLaMA, Claude, Gemini, Qwen, and Mistral models through their APIs. Each article received a sentiment score from all ten LLMs; the scores were scaled to the [0, 1] range and combined with prices and technical indicators such as RSI, ROC, and Bollinger Band Width. Two state-of-the-art sequence models, Reformer and Mamba, were trained separately on the dataset, using the sentiment scores produced by each LLM as input. Hyperparameters were optimized with Optuna, and the models were evaluated over a 3-day evaluation period with mean squared error (MSE) as the evaluation metric. Mamba was not only faster but also more accurate than Reformer for every one of the 10 LLMs tested. Mamba performed best with LLaMA 3.3-70B, reaching the lowest error of 0.137. While Reformer could capture broader trends in the data, it appeared to over-smooth sudden changes indicated by the LLM sentiment scores. This study highlights the potential of pairing LLM-based semantic analysis with efficient temporal modeling to enhance real-time financial forecasting.
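The feature construction and metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the window lengths, the simple-average form of RSI, and the use of a population standard deviation for the Bollinger bands are all assumptions made for illustration, since the abstract does not specify them.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Linearly rescale raw sentiment scores to [0, 1] (constant input maps to 0.5)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rate_of_change(prices, period):
    """ROC in percent; entries without enough history are None."""
    out = [None] * len(prices)
    for t in range(period, len(prices)):
        out[t] = 100.0 * (prices[t] - prices[t - period]) / prices[t - period]
    return out


def simple_rsi(prices, period):
    """RSI from plain averages of gains and losses (assumed variant, not Wilder smoothing)."""
    out = [None] * len(prices)
    for t in range(period, len(prices)):
        deltas = [prices[i] - prices[i - 1] for i in range(t - period + 1, t + 1)]
        avg_gain = sum(d for d in deltas if d > 0) / period
        avg_loss = sum(-d for d in deltas if d < 0) / period
        if avg_loss == 0:
            out[t] = 100.0  # no losses in the window
        else:
            out[t] = 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
    return out


def bollinger_band_width(prices, window, k=2.0):
    """(upper - lower) / middle, with bands at mean +/- k * population std (assumed)."""
    out = [None] * len(prices)
    for t in range(window - 1, len(prices)):
        w = prices[t - window + 1 : t + 1]
        out[t] = (2.0 * k * pstdev(w)) / mean(w)
    return out


def mse(y_true, y_pred):
    """Mean squared error between actual and predicted prices."""
    errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return sum(errors) / len(errors)
```

In a pipeline of this kind, each minute's row would hold the price, the indicator values, and the scaled sentiment score from one LLM, and `mse` would be computed on the held-out evaluation days.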

Paper Structure

This paper contains 24 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Overview of the LLM-Based Stock Price Prediction Pipeline.
  • Figure 2: Architecture of the Mamba Block
  • Figure 3: Overview of the Reformer architecture
  • Figure 4: Mamba Model: AAPL minute-level prediction using LLaMA 3.3 70B sentiment scores
  • Figure 5: Mamba Model: AAPL minute-level prediction using GPT-4o Mini sentiment scores
  • ...and 2 more figures