When a Nation Speaks: Machine Learning and NLP in People's Sentiment Analysis During Bangladesh's 2024 Mass Uprising

Md. Samiul Alim, Mahir Shahriar Tamim, Maisha Rahman, Tanvir Ahmed Khan, Md Mushfique Anwar

TL;DR

This paper addresses the lack of Bangla sentiment analysis during periods of civil unrest. It introduces a crisis-centric dataset of 2,028 Bangla news headlines from Facebook, annotated as Outrage, Hope, or Despair, and applies LDA topic modeling alongside supervised and zero-shot classifiers. Language-specific models (e.g., BanglaBERT) achieve the leading accuracy among the fine-tuned and traditional models (around 72%), traditional methods remain competitive, and zero-shot LLMs reach up to 74%. The dataset and findings have practical implications for crisis communication and policymaking, while the study notes limitations tied to source modality and taxonomy.

Abstract

Sentiment analysis, an emerging research area within natural language processing (NLP), has primarily been explored in contexts like elections and social media trends, but there remains a significant gap in understanding emotional dynamics during civil unrest, particularly in the Bangla language. Our study pioneers sentiment analysis in Bangla during a national crisis by examining public emotions amid Bangladesh's 2024 mass uprising. We curated a unique dataset of 2,028 annotated news headlines from major Facebook news portals, classifying them into Outrage, Hope, and Despair. Through Latent Dirichlet Allocation (LDA), we identified prevalent themes like political corruption and public protests, and analyzed how events such as internet blackouts shaped sentiment patterns. The language-specific BanglaBERT model outperformed multilingual transformers (mBERT: 67%, XLM-RoBERTa: 71%) and traditional machine learning methods (SVM and Logistic Regression: both 70%). These results highlight the effectiveness of language-specific models and offer valuable insights into public sentiment during political turmoil.
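
To make the three-way classification task concrete, the following is a minimal sketch of a multinomial logistic regression baseline over bag-of-words counts, written in plain Python. It is not the authors' implementation: the feature representation, the hyperparameters (`epochs`, `lr`), the helper names (`featurize`, `train`, `predict`), and the English toy headlines are all assumptions made for illustration. Only the three class names come from the abstract.

```python
import math
from collections import Counter

# Class names taken from the abstract; everything else here is illustrative.
LABELS = ["Outrage", "Hope", "Despair"]


def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary (assumed features)."""
    counts = Counter(text.lower().split())
    return [counts.get(word, 0) for word in vocab]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(x, W, b):
    """Linear score per class: w_c . x + b_c."""
    return [sum(w * v for w, v in zip(W[c], x)) + b[c] for c in range(len(LABELS))]


def train(texts, labels, epochs=300, lr=0.1):
    """Full-batch gradient descent on the softmax cross-entropy loss."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    X = [featurize(t, vocab) for t in texts]
    y = [LABELS.index(label) for label in labels]
    W = [[0.0] * len(vocab) for _ in LABELS]
    b = [0.0] * len(LABELS)
    n = len(X)
    for _ in range(epochs):
        grad_W = [[0.0] * len(vocab) for _ in LABELS]
        grad_b = [0.0] * len(LABELS)
        for x, target in zip(X, y):
            probs = softmax(class_scores(x, W, b))
            for c in range(len(LABELS)):
                err = probs[c] - (1.0 if c == target else 0.0)
                grad_b[c] += err
                for j, value in enumerate(x):
                    if value:
                        grad_W[c][j] += err * value
        for c in range(len(LABELS)):
            b[c] -= lr * grad_b[c] / n
            for j in range(len(vocab)):
                W[c][j] -= lr * grad_W[c][j] / n
    return vocab, W, b


def predict(text, model):
    """Return the highest-scoring class; unseen words are ignored."""
    vocab, W, b = model
    scores = class_scores(featurize(text, vocab), W, b)
    return LABELS[scores.index(max(scores))]


# Hypothetical English stand-ins for headlines; not drawn from the dataset.
toy_texts = [
    "police attack students protest",
    "anger over violent crackdown",
    "talks bring hope for peace",
    "students celebrate new dawn",
    "families mourn lost children",
    "internet blackout leaves city silent",
]
toy_labels = ["Outrage", "Outrage", "Hope", "Hope", "Despair", "Despair"]

model = train(toy_texts, toy_labels)
```

Calling `predict("hope for peace", model)` on this toy model returns one of the three class names. A real experiment would use the Bangla headlines, a held-out test split, and the tokenization appropriate to the language.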

Paper Structure

This paper contains 22 sections, 7 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Workflow for sentiment analysis, involving data collection, labeling, augmentation, preprocessing, and classification into Outrage, Hope, and Despair.
  • Figure 2: Sample Comments with Sentiment Classification, Annotation, and Voting Outcome.
  • Figure 3: Word clouds for the three sentiment classes.
  • Figure 4: Class distributions before and after augmentation.
  • Figure 5: Topic Modeling using LDA to identify K topics from the dataset.
  • ...and 2 more figures
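
The caption of Figure 2 refers to annotation and a voting outcome. A minimal sketch of how a majority vote over annotator labels could be resolved is shown below. The number of annotators and the tie-handling rule (returning `None` so the item can be reviewed) are assumptions, since this page does not specify them, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most common label, or None on an empty list or a tie.

    Returning None on ties is an assumed policy for illustration only.
    """
    ranked = Counter(votes).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no strict majority among the top labels
    return ranked[0][0]


# Hypothetical annotator labels for two headlines.
clear_case = majority_vote(["Outrage", "Outrage", "Hope"])
tied_case = majority_vote(["Outrage", "Hope", "Despair"])
```

Here `clear_case` resolves to `"Outrage"`, while `tied_case` is `None` because all three labels receive one vote each.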