Stock Market Dynamics Through Deep Learning Context

Amirhossein Aminimehr, Amin Aminimehr, Hamid Moradi Kamali, Sauleh Eetemadi, Saeid Hoseinzade

TL;DR

This work tackles the inadequacy of narrow feature sets and opaque models in financial forecasting by introducing a unified feature matrix that fuses Twitter-derived signals (including sentiment, engagement, and writer influence) with historical OHLC data, targeting one-step-ahead binary classification of price movement. It evaluates CNN and CNN-LSTM architectures, finding that a CNN with the proposed feature matrix is more accurate than price-only or embedding-based baselines, and employs LIME for local interpretability to identify driving factors, notably tweet volume. The study provides instance-level, time-resolved explanations showing how market movements are influenced by social media signals, while also acknowledging limitations such as the absence of statistical significance tests for feature importance. Overall, it demonstrates that integrating broad social-media features with local interpretability can enhance predictive performance and trust in deep learning models for financial markets, with clear avenues for extending to higher-resolution intraday data and other asset classes, including cryptocurrencies.

Abstract

Studies on financial market prediction lack a comprehensive feature set that captures a broad range of contributing factors, which leads to imprecise results. Furthermore, although they draw on recent innovations in explainable AI, these studies have not used it to provide an illustrative summary of market-driving factors. In this study, we therefore propose a novel feature matrix that holds a broad range of features, including Twitter content and historical market data, for a binary classification task of one-step-ahead prediction. Using the proposed feature matrix not only improves prediction accuracy over existing feature representations; combined with explainable AI, it also enables a fresh analysis of the importance of the included market-driving factors. Using the LIME interpretation technique, our interpretation study shows that tweet volume is the most important factor in our feature matrix driving the market's movements.
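
To make the idea of a fused feature matrix concrete, the sketch below stacks daily OHLC prices and aggregated tweet signals into one window of rows and attaches a one-step-ahead binary label. It is a minimal illustration under stated assumptions, not the authors' pipeline: the field names (tweet_volume, mean_sentiment, mean_likes), the sample values, the window length, and the label rule (1 when the next close is above the previous close) are all hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical daily records: OHLC prices plus aggregated tweet signals.
# Field names and values are illustrative, not taken from the paper's data.
DAYS: List[Dict[str, float]] = [
    {"open": 100.0, "high": 103.0, "low": 99.0, "close": 102.0,
     "tweet_volume": 120, "mean_sentiment": 0.10, "mean_likes": 4.0},
    {"open": 102.0, "high": 102.5, "low": 100.0, "close": 101.0,
     "tweet_volume": 95, "mean_sentiment": -0.05, "mean_likes": 3.2},
    {"open": 101.0, "high": 105.0, "low": 100.5, "close": 104.0,
     "tweet_volume": 310, "mean_sentiment": 0.30, "mean_likes": 9.1},
    {"open": 104.0, "high": 104.5, "low": 102.0, "close": 103.0,
     "tweet_volume": 150, "mean_sentiment": 0.02, "mean_likes": 5.0},
    {"open": 103.0, "high": 106.0, "low": 102.5, "close": 105.0,
     "tweet_volume": 280, "mean_sentiment": 0.25, "mean_likes": 8.4},
]

# Column order of the fused matrix: price features first, then tweet features.
FEATURES = ["open", "high", "low", "close",
            "tweet_volume", "mean_sentiment", "mean_likes"]


def build_dataset(
    days: List[Dict[str, float]], window: int
) -> Tuple[List[List[List[float]]], List[int]]:
    """Build (X, y) for one-step-ahead binary classification.

    X[i] is a window x len(FEATURES) matrix covering days t-window .. t-1.
    y[i] is 1 if the close on day t is above the close on day t-1, else 0
    (an assumed label rule for this sketch).
    """
    X: List[List[List[float]]] = []
    y: List[int] = []
    for t in range(window, len(days)):
        matrix = [[float(days[s][f]) for f in FEATURES]
                  for s in range(t - window, t)]
        label = 1 if days[t]["close"] > days[t - 1]["close"] else 0
        X.append(matrix)
        y.append(label)
    return X, y


X, y = build_dataset(DAYS, window=2)
print(len(X), "samples of shape", (len(X[0]), len(X[0][0])))
print("labels:", y)
```

Each sample in X could then be passed to a convolutional classifier, and because every column keeps a named meaning, a local explainer can attribute a prediction to individual signals such as tweet volume.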

Paper Structure

This paper contains 14 sections, 2 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Variables used to build the feature matrix
  • Figure 2: A sample of the final feature matrix at t-1
  • Figure 3: Flow chart of the process of prediction and feature interpretation
  • Figure 4: Instance-based feature importance of Amazon through time (The line plot only illustrates the True predictions)
  • Figure 5: Instance-based feature importance of Tesla through time (The line plot only illustrates the True predictions)
  • ...and 1 more figure