Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish
Kevin Cohen, Laura Manrique-Gómez, Rubén Manrique
TL;DR
The paper tackles irony detection in 19th-century Spanish-language journalism by building a historical sentiment-irony dataset (LatamXIX) and evaluating two data-centric strategies with BERT-based models and GPT-4o. It first tests emotion- and context-focused text enhancement and then a semi-automatic annotation workflow to augment the data, finding that enhancement alone offers limited gains while augmentation improves the detection of ironic content and balances the class distribution. The study introduces a sizable historical dataset and a human-in-the-loop annotation method, demonstrating how domain context and expert verification can substantially improve model performance for complex figurative language. This work advances historical NLP and sentiment analysis by providing a scalable, hybrid approach that combines LLM capabilities with human expertise to better capture irony in culturally rich, historical corpora.
Abstract
This study explores the use of large language models (LLMs) to enhance datasets and improve irony detection in 19th-century Latin American newspapers. Two strategies were employed to evaluate the efficacy of BERT and GPT-4o models in capturing the subtle, nuanced nature of irony, through both multi-class and binary classification tasks. First, we implemented dataset enhancements focused on enriching emotional and contextual cues; however, these showed limited impact on historical language analysis. The second strategy, a semi-automated annotation process, effectively addressed class imbalance and augmented the dataset with high-quality annotations. Despite the challenges posed by the complexity of irony, this work advances sentiment analysis through two key contributions: introducing a new historical Spanish dataset tagged for sentiment analysis and irony detection, and proposing a semi-automated annotation methodology in which human expertise is crucial for refining LLM results, enriched by incorporating historical and cultural contexts as core features.
