Mining Tweets to Predict Future Bitcoin Price

Ashutosh Hathidara, Gaurav Atavale, Suyash Chaudhary

TL;DR

This paper investigates whether Twitter chatter about Bitcoin can predict the next-day price by constructing day-level feature vectors from a large English-filtered tweet dataset and applying regression and classification models. It finds that sentiment alone has limited predictive power, but that richer tweet-derived features combined with standard predictive pipelines can yield moderate forecasts. The study compares multiple modeling approaches, highlighting ridge and linear regression as strong regressors and Random Forest as a competitive classifier, while clustering methods show limited reliability under data size constraints. The work offers a pathway for incorporating social-media signals into crypto price forecasting and suggests extensions to other cryptocurrencies and additional data sources.

Abstract

Interest in Bitcoin as an investment has grown over the last decade, and with it the number of social media posts about cryptocurrency, especially Bitcoin. This project analyzes user tweet data in combination with Bitcoin price data to examine the relationship between price fluctuations and the conversation among millions of people on Twitter. The study then exploits this relationship between user tweets and Bitcoin prices to predict the future Bitcoin price. We use novel techniques and methods to analyze the data and make price predictions.
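To make the day-level setup from the TL;DR concrete, the following is a minimal sketch in plain Python, not the authors' pipeline. It assumes each tweet has already been reduced to a `(day, sentiment)` pair, uses two illustrative features per day (tweet count and mean sentiment), and fits a closed-form ridge regression against the next day's price. The function names, the feature choice, and the `alpha` value are all hypothetical.

```python
from collections import defaultdict
from datetime import timedelta


def daily_features(tweets):
    """Aggregate (day, sentiment) pairs into per-day [tweet_count, mean_sentiment].

    The two features are illustrative assumptions, not the paper's feature set.
    """
    buckets = defaultdict(list)
    for day, sentiment in tweets:
        buckets[day].append(sentiment)
    return {d: [float(len(s)), sum(s) / len(s)] for d, s in buckets.items()}


def next_day_pairs(features, prices):
    """Pair each day's feature vector with the following day's price."""
    X, y = [], []
    for day in sorted(features):
        target_day = day + timedelta(days=1)
        if target_day in prices:  # skip days with no next-day price
            X.append(features[day])
            y.append(prices[target_day])
    return X, y


def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept.

    Returns [intercept, w_1, ..., w_d].
    """
    Z = [[1.0] + list(row) for row in X]  # prepend a bias column
    d = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(d)] for i in range(d)]
    for i in range(1, d):  # penalize the weights, not the intercept
        A[i][i] += alpha
    b = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(d)]
    return solve(A, b)


def predict(w, x):
    """Predicted next-day price for one day-level feature vector."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
```

In use, `daily_features` and `next_day_pairs` would turn a tweet list and a `{date: price}` mapping into training pairs, and `fit_ridge` followed by `predict` would produce the forecast. Because raw counts and sentiment means sit on very different scales, standardizing the features before applying the ridge penalty would be a sensible next step.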

Paper Structure

This paper contains 9 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: Raw data extracted from Kaggle
  • Figure 2: Results extracted from EDA
  • Figure 3: Results extracted from Sentiment Analysis
  • Figure 4: K-Means clustering analysis
  • Figure 5: Results extracted from Regression Analysis
  • ...and 1 more figure