Predicting Stock Price Movement with LLM-Enhanced Tweet Emotion Analysis

An Vuong, Susan Gauch

TL;DR

The paper tackles the challenge of predicting short-term stock price movements by augmenting historical price data with emotion features derived from tweets. It introduces a three-part framework: tweet preprocessing with Llama 3.1-8B-Instruct, three emotion-analysis methods (DistilRoBERTa, NRC-Intensity, NRC-Lexicon), and an LSTM trained on prior-day data to classify next-day movement as Stable, Significant Increase, or Significant Decrease. Results on TSLA, AAPL, and AMZN show that adding emotion features improves average accuracy on significant movements over a price-only baseline (13.5%). DistilRoBERTa performs best, rising from 23.6% to 38.5% when the tweets are first preprocessed by Llama. The findings indicate that large language model preprocessing improves emotion features for financial forecasting and point to practical potential for emotion-informed trading signals. Suggested future work includes incorporating technical indicators and financial news signals.
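To make the three-class target concrete, here is a minimal sketch of how next-day movement could be labeled from closing prices. The summary does not state the cutoff that separates a "significant" move from a stable one, so the 2% `THRESHOLD` below is purely a placeholder assumption, and `label_movements` is a hypothetical helper name rather than anything from the paper.

```python
from typing import List

# Assumption: the cutoff for a "significant" move is not given in the summary.
# 2% is a placeholder chosen only for illustration.
THRESHOLD = 0.02


def label_movements(closes: List[float], threshold: float = THRESHOLD) -> List[str]:
    """Label each day by the relative change from its close to the next day's close.

    Returns one label per consecutive pair of days, so the result has
    len(closes) - 1 entries.
    """
    labels = []
    for today, tomorrow in zip(closes, closes[1:]):
        change = (tomorrow - today) / today
        if change > threshold:
            labels.append("Significant Increase")
        elif change < -threshold:
            labels.append("Significant Decrease")
        else:
            labels.append("Stable")
    return labels


# Made-up closing prices, not data from the paper.
print(label_movements([100.0, 103.0, 102.0, 98.0]))
```

With a symmetric threshold like this, most trading days tend to fall into the Stable class, which is one reason accuracy on the two significant classes is reported separately.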

Abstract

Accurately predicting short-term stock price movement remains a challenging task due to the market's inherent volatility and sensitivity to investor sentiment. This paper presents a deep learning framework that integrates emotion features extracted from tweet data with historical stock price information to forecast significant price changes on the following day. We utilize Meta's Llama 3.1-8B-Instruct model to preprocess tweet data, thereby enhancing the quality of emotion features derived from three emotion analysis approaches: a transformer-based DistilRoBERTa classifier from the Hugging Face library and two lexicon-based methods using National Research Council Canada (NRC) resources. These features are combined with previous-day stock price data to train a Long Short-Term Memory (LSTM) model. Experimental results on TSLA, AAPL, and AMZN stocks show that all three emotion analysis methods improve the average accuracy for predicting significant price movements compared to the baseline model using only historical stock prices, which yields an accuracy of 13.5%. The DistilRoBERTa-based stock prediction model achieves the best performance, with accuracy rising from 23.6% to 38.5% when using Llama-enhanced emotion analysis. These results demonstrate that using large language models to preprocess tweet content enhances the effectiveness of emotion analysis, which in turn improves the accuracy of predicting significant stock price movements.
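The step of combining emotion features with previous-day price data can be sketched as follows. This is an illustrative assumption about the data layout, not the authors' implementation: per-tweet emotion scores are averaged per day, and each day's closing price plus its averaged emotion vector is paired with the label for the following day's movement. The emotion names, the helper names, the zero-fill for days without tweets, and the sample values are all hypothetical. The emotion classifier and the LSTM itself are out of scope here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical subset of emotion labels, for illustration only.
EMOTIONS = ["joy", "fear"]


def daily_emotion_features(tweets, emotions=EMOTIONS):
    """Average per-tweet emotion scores for each date.

    tweets: iterable of (date, {emotion: score}) pairs.
    Returns {date: [mean score per emotion, in the order of `emotions`]}.
    """
    by_day = defaultdict(list)
    for date, scores in tweets:
        by_day[date].append(scores)
    return {
        date: [mean(s.get(e, 0.0) for s in day) for e in emotions]
        for date, day in by_day.items()
    }


def build_samples(dates, closes, daily_features, labels, n_emotions=len(EMOTIONS)):
    """Pair day t's inputs with the label for the move from day t to day t+1.

    Assumption: days without tweets get an all-zero emotion vector.
    """
    samples = []
    for i, label in enumerate(labels):
        emotion_vec = daily_features.get(dates[i], [0.0] * n_emotions)
        samples.append(([closes[i]] + emotion_vec, label))
    return samples


# Made-up tweets, prices, and labels, not data from the paper.
tweets = [
    ("2022-01-03", {"joy": 1.0, "fear": 0.0}),
    ("2022-01-03", {"joy": 0.5, "fear": 0.5}),
]
features = daily_emotion_features(tweets)
samples = build_samples(
    ["2022-01-03", "2022-01-04", "2022-01-05"],
    [100.0, 103.0, 102.0],
    features,
    ["Significant Increase", "Stable"],
)
print(samples)
```

Each input row would then be reshaped into a one-step sequence before being passed to a sequence model such as an LSTM.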

Paper Structure

This paper contains 11 sections, 6 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The overall pipeline for predicting significant stock price movements
  • Figure 2: Distribution of extracted emotions from tweets using three different emotion analysis methods
  • Figure 3: Class distribution of daily significant stock price movements for TSLA, AAPL, and AMZN
  • Figure 4: Overall average accuracy for predicting Significant Increase and Decrease movements across 3 stocks