Table of Contents
Fetching ...

Identification of emotions on Twitter during the 2022 electoral process in Colombia

Juan Jose Iguaran Fernandez, Juan Manuel Perez, German Rosati

TL;DR

This paper tackles emotion detection in Spanish Twitter during Colombia's 2022 presidential elections by building a 1,200-tweet corpus labeled with a fine-grained emotion taxonomy and comparing fine-tuned Spanish BERT-family models (RoBERTa, RoBERTuito, BETO) against GPT-3.5 in a few-shot setting. It employs Label Studio for multi-label annotation, groups labels into four macro-emotions (Joy, Fear, Sadness, Disgust) based on correlations, and evaluates model performance using cross-validated F1 metrics. Key findings show GPT-3.5 excels in Joy and Disgust, often outperforming fine-tuned models in negative emotions, while RoBERTa/ RoBERTuito perform robustly on Joy/Disgust but lag on Fear/Sadness; overall, LLMs demonstrate strong potential for political emotion detection in this context. The dataset and code are publicly released, underscoring the value of resources for Latin American Spanish and highlighting the need for larger, more diverse corpora and prompting strategies to optimize LLM performance in sociopolitical analyses.

Abstract

The study of Twitter as a means for analyzing social phenomena has gained interest in recent years due to the availability of large amounts of data in a relatively spontaneous environment. Within opinion-mining tasks, emotion detection is specially relevant, as it allows for the identification of people's subjective responses to different social events in a more granular way than traditional sentiment analysis based on polarity. In the particular case of political events, the analysis of emotions in social networks can provide valuable information on the perception of candidates, proposals, and other important aspects of the public debate. In spite of this importance, there are few studies on emotion detection in Spanish and, to the best of our knowledge, few resources are public for opinion mining in Colombian Spanish, highlighting the need for generating resources addressing the specific cultural characteristics of this variety. In this work, we present a small corpus of tweets in Spanish related to the 2022 Colombian presidential elections, manually labeled with emotions using a fine-grained taxonomy. We perform classification experiments using supervised state-of-the-art models (BERT models) and compare them with GPT-3.5 in few-shot learning settings. We make our dataset and code publicly available for research purposes.

Identification of emotions on Twitter during the 2022 electoral process in Colombia

TL;DR

This paper tackles emotion detection in Spanish Twitter during Colombia's 2022 presidential elections by building a 1,200-tweet corpus labeled with a fine-grained emotion taxonomy and comparing fine-tuned Spanish BERT-family models (RoBERTa, RoBERTuito, BETO) against GPT-3.5 in a few-shot setting. It employs Label Studio for multi-label annotation, groups labels into four macro-emotions (Joy, Fear, Sadness, Disgust) based on correlations, and evaluates model performance using cross-validated F1 metrics. Key findings show GPT-3.5 excels in Joy and Disgust, often outperforming fine-tuned models in negative emotions, while RoBERTa/ RoBERTuito perform robustly on Joy/Disgust but lag on Fear/Sadness; overall, LLMs demonstrate strong potential for political emotion detection in this context. The dataset and code are publicly released, underscoring the value of resources for Latin American Spanish and highlighting the need for larger, more diverse corpora and prompting strategies to optimize LLM performance in sociopolitical analyses.

Abstract

The study of Twitter as a means for analyzing social phenomena has gained interest in recent years due to the availability of large amounts of data in a relatively spontaneous environment. Within opinion-mining tasks, emotion detection is specially relevant, as it allows for the identification of people's subjective responses to different social events in a more granular way than traditional sentiment analysis based on polarity. In the particular case of political events, the analysis of emotions in social networks can provide valuable information on the perception of candidates, proposals, and other important aspects of the public debate. In spite of this importance, there are few studies on emotion detection in Spanish and, to the best of our knowledge, few resources are public for opinion mining in Colombian Spanish, highlighting the need for generating resources addressing the specific cultural characteristics of this variety. In this work, we present a small corpus of tweets in Spanish related to the 2022 Colombian presidential elections, manually labeled with emotions using a fine-grained taxonomy. We perform classification experiments using supervised state-of-the-art models (BERT models) and compare them with GPT-3.5 in few-shot learning settings. We make our dataset and code publicly available for research purposes.
Paper Structure (14 sections, 1 equation, 6 figures, 3 tables)

This paper contains 14 sections, 1 equation, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Percentage of Hashtags According to Assigned Political Orientation
  • Figure 2: Percentage of tweets according to political orientation over time
  • Figure 3: Annotation workflow
  • Figure 4: Labeling interface
  • Figure 5: Correlation index between emotion labels assigned to the tweets
  • ...and 1 more figures