Optimism, Expectation, or Sarcasm? Multi-Class Hope Speech Detection in Spanish and English

Sabur Butt, Fazlourrahman Balouchzahi, Ahmad Imam Amjad, Maaz Amjad, Hector G. Ceballos, Salud Maria Jimenez-Zafra

TL;DR

This work tackles the detection of hope, a nuanced emotion, in English and Spanish text, including its sarcastic manifestations. It introduces PolyHope V2, a multilingual dataset of over 30,000 annotated tweets covering four hope subtypes (Generalized, Realistic, Unrealistic, and Sarcastic), with sarcastic instances labeled explicitly. The study benchmarks fine-tuned Transformer models against zero-shot and few-shot LLMs (GPT-4 and Llama 3) and finds that task-specific fine-tuning yields higher accuracy and better subtype discrimination. By providing rich annotations and cross-linguistic benchmarks, the paper contributes a resource and a foundation for more semantically aware cross-lingual emotion recognition, while also highlighting persistent challenges in distinguishing closely related hope subtypes amid sarcasm.

Abstract

Hope is a complex and underexplored emotional state that plays a significant role in education, mental health, and social interaction. Unlike basic emotions, hope manifests in nuanced forms ranging from grounded optimism to exaggerated wishfulness or sarcasm, making it difficult for Natural Language Processing systems to detect accurately. This study introduces PolyHope V2, a multilingual, fine-grained hope speech dataset comprising over 30,000 annotated tweets in English and Spanish. This resource distinguishes between four hope subtypes (Generalized, Realistic, Unrealistic, and Sarcastic) and enhances existing datasets by explicitly labeling sarcastic instances. We benchmark multiple pretrained transformer models and compare them with large language models (LLMs) such as GPT-4 and Llama 3 under zero-shot and few-shot regimes. Our findings show that fine-tuned transformers outperform prompt-based LLMs, especially in distinguishing nuanced hope categories and sarcasm. Through qualitative analysis and confusion matrices, we highlight systematic challenges in separating closely related hope subtypes. The dataset and results provide a robust foundation for future emotion recognition tasks that demand greater semantic and contextual sensitivity across languages.

Paper Structure

This paper contains 12 sections, 2 tables.