Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis

Mohammad Al Ridhawi; Mahtab Haj Ali; Hussein Al Osman

TL;DR

This paper presents an integrated framework that combines a node transformer architecture with BERT-based sentiment analysis for stock price forecasting. The stock market is represented as a graph in which individual stocks form nodes and edges capture relationships including sectoral affiliations, correlated price movements, and supply chain connections.

Abstract

Stock market prediction presents considerable challenges for investors, financial institutions, and policymakers operating in complex market environments characterized by noise, non-stationarity, and behavioral dynamics. Traditional forecasting methods often fail to capture the intricate patterns and cross-sectional dependencies inherent in financial markets. This paper presents an integrated framework combining a node transformer architecture with BERT-based sentiment analysis for stock price forecasting. The proposed model represents the stock market as a graph structure where individual stocks form nodes and edges capture relationships including sectoral affiliations, correlated price movements, and supply chain connections. A fine-tuned BERT model extracts sentiment from social media posts and combines it with quantitative market features through attention-based fusion. The node transformer processes historical market data while capturing both temporal evolution and cross-sectional dependencies among stocks. Experiments on 20 S&P 500 stocks spanning January 1982 to March 2025 demonstrate that the integrated model achieves a mean absolute percentage error (MAPE) of 0.80% for one-day-ahead predictions, compared to 1.20% for ARIMA and 1.00% for LSTM. Sentiment analysis reduces prediction error by 10% overall and 25% during earnings announcements, while graph-based modeling contributes an additional 15% improvement by capturing inter-stock dependencies. Directional accuracy reaches 65% for one-day forecasts. Statistical validation through paired t-tests confirms these improvements (p < 0.05 for all comparisons). The model maintains MAPE below 1.5% during high-volatility periods where baseline models exceed 2%.
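The two headline metrics in the abstract, MAPE and directional accuracy, can be illustrated with a minimal sketch. The function names and the toy prices below are hypothetical and are not taken from the paper. The sketch assumes that direction is measured relative to the previous day's close, which the abstract does not state.

```python
import numpy as np


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Assumes no actual price is zero, which holds for traded stock prices.
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)


def directional_accuracy(prev_close, actual, predicted):
    """Share of days (in percent) where the predicted move has the same sign
    as the realized move, both measured against the previous close.
    Measuring against the previous close is an assumption of this sketch."""
    prev_close = np.asarray(prev_close, dtype=float)
    actual_dir = np.sign(np.asarray(actual, dtype=float) - prev_close)
    pred_dir = np.sign(np.asarray(predicted, dtype=float) - prev_close)
    return float(np.mean(actual_dir == pred_dir) * 100.0)


# Toy data (hypothetical): three one-day-ahead forecasts.
prev_close = [99.0, 201.0, 50.0]
actual = [100.0, 200.0, 51.0]
predicted = [101.0, 198.0, 49.0]

print(mape(actual, predicted))
print(directional_accuracy(prev_close, actual, predicted))
```

Under this definition, a reported MAPE of 0.80% means the one-day-ahead forecast misses the realized price by 0.80% of that price on average.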

Paper Structure (53 sections, 33 equations, 5 figures, 10 tables)

Introduction
Challenges in Stock Price Forecasting
Existing Approaches
Research Motivation and Contributions
Literature Review
Statistical Methods
Classical Machine Learning
Deep Learning Methods
Convolutional Neural Networks
Recurrent Neural Networks
Graph Neural Networks
Transformer Models
Hybrid Models and Research Gaps
Datasets and Preprocessing
Financial Market Dataset
...and 38 more sections

Figures (5)

Figure 1: Feature engineering pipeline. Raw OHLCV data is processed through multiple technical indicator computations, normalized using z-score standardization, and concatenated into the final feature vector.
Figure 2: System architecture. Price data, volume, and technical indicators are jointly processed through normalization, graph construction, and temporal encoding before the node transformer. Social media posts are processed through BERT and sentiment aggregation. Both streams combine through attention-based multimodal fusion.
Figure 3: Single transformer layer architecture. Input passes through multi-head self-attention with residual connection and layer normalization, followed by a position-wise feed-forward network with another residual connection and normalization.
Figure 4: BERT sentiment extraction pipeline. Raw social media posts are preprocessed, tokenized, encoded through BERT, pooled via [CLS] token, and classified to sentiment scores.
Figure 5: Adaptive fusion mechanism. The weighting coefficient $\alpha_t$ is computed from volatility and sentiment magnitude, then used to blend node transformer and sentiment-based predictions.
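
The blending described in the Figure 5 caption can be sketched as a convex combination of two predictions. The caption says only that the weighting coefficient is computed from volatility and sentiment magnitude. The sigmoid form, the parameter names `w_vol`, `w_sent`, and `bias`, and their default values are assumptions made for illustration, not the paper's formulation.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def adaptive_fusion(pred_node, pred_sent, volatility, sentiment,
                    w_vol=1.0, w_sent=1.0, bias=0.0):
    """Blend a node transformer prediction with a sentiment-based prediction.

    alpha is a per-step weight in (0, 1). Its functional form and the
    weights w_vol, w_sent, and bias are hypothetical; in practice they
    would be learned.
    """
    pred_node = np.asarray(pred_node, dtype=float)
    pred_sent = np.asarray(pred_sent, dtype=float)
    volatility = np.asarray(volatility, dtype=float)
    sentiment = np.asarray(sentiment, dtype=float)

    # Weight computed from volatility and sentiment magnitude (assumed form).
    alpha = sigmoid(w_vol * volatility + w_sent * np.abs(sentiment) + bias)
    # Convex combination: the result always lies between the two predictions.
    fused = alpha * pred_node + (1.0 - alpha) * pred_sent
    return fused, alpha


# Toy inputs (hypothetical): two time steps for a single stock.
fused, alpha = adaptive_fusion(
    pred_node=[101.0, 99.0],
    pred_sent=[103.0, 98.0],
    volatility=[0.0, 0.5],
    sentiment=[0.0, -0.8],
)
print(alpha)
print(fused)
```

Because the weight stays strictly between 0 and 1, neither stream is ever fully discarded. Which stream should receive more weight as volatility rises is a modeling choice that this sketch does not settle.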
