TartuNLP at EvaLatin 2024: Emotion Polarity Detection
Aleksei Dorkin, Kairit Sirts
TL;DR
The paper tackles emotion polarity detection in historical Latin texts using two training-data annotation pipelines (heuristic labels derived from a polarity lexicon and labels generated by GPT-4), combined with adapter-based, parameter-efficient fine-tuning of XLM-RoBERTa. It investigates monolingual (Latin) and cross-lingual (English) knowledge transfer through language and task adapters, and compares two submissions: heuristics-based (TartuNLP_1) and GPT-4-based (TartuNLP_2). The submission trained on GPT-4-labeled data took first place overall. Ablations show that monolingual transfer can boost performance, whereas cross-lingual transfer does not always help, highlighting both the value and the limitations of LLM-based supervision for historical languages. The work demonstrates the potential of LLM-generated annotations for NLP tasks in Latin and discusses directions such as a multi-label framing to better capture nuanced polarity.
Abstract
This paper presents the TartuNLP team's submission to the EvaLatin 2024 shared task on emotion polarity detection for historical Latin texts. Our system relies on two distinct approaches to annotating training data for supervised learning: 1) creating heuristics-based labels by adopting the polarity lexicon provided by the organizers and 2) generating labels with GPT-4. We employed parameter-efficient fine-tuning using the adapters framework and experimented with both monolingual and cross-lingual knowledge transfer for training language and task adapters. Our submission with the LLM-generated labels achieved first place overall in the emotion polarity detection task. Our results suggest that LLM-based annotation is a promising approach for texts in Latin.
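
To make the first annotation approach concrete, the following is a minimal sketch of what lexicon-based heuristic labeling could look like. It is not the authors' actual heuristic, which this summary does not specify. The toy lexicon entries, the `heuristic_label` function, and the four-way label set (positive, negative, mixed, neutral) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon (lemma -> polarity score). The real lexicon
# provided by the organizers is not reproduced here.
TOY_LEXICON = {
    "amor": 1.0,
    "gaudium": 1.0,
    "dolor": -1.0,
    "mors": -1.0,
}


def heuristic_label(tokens, lexicon=TOY_LEXICON):
    """Assign a coarse polarity label by counting lexicon hits.

    Assumed rule (illustrative only): hits of both signs give "mixed",
    hits of one sign give that sign, and no hits give "neutral".
    """
    counts = Counter()
    for tok in tokens:
        score = lexicon.get(tok.lower(), 0.0)
        if score > 0:
            counts["positive"] += 1
        elif score < 0:
            counts["negative"] += 1

    pos, neg = counts["positive"], counts["negative"]
    if pos and neg:
        return "mixed"
    if pos:
        return "positive"
    if neg:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Each sentence is given as a list of (assumed already lemmatized) tokens.
    for sent in (["amor", "et", "gaudium"], ["amor", "et", "mors"], ["et", "in"]):
        print(sent, heuristic_label(sent))
```

Labels produced this way are noisy, which is one reason to compare them against LLM-generated labels as a second source of supervision.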
