KInITVeraAI at SemEval-2023 Task 3: Simple yet Powerful Multilingual Fine-Tuning for Persuasion Techniques Detection

Timo Hromadka, Timotej Smolen, Tomas Remis, Branislav Pecher, Ivan Srba

TL;DR

This work addresses multi-label persuasion techniques detection across nine languages, three of which are surprise languages with no training data, by fine-tuning a single multilingual transformer (XLM-RoBERTa large) and calibrating confidence thresholds separately for seen and unseen languages. The authors systematically compare monolingual models trained on translated data, multilingual models, and an ensemble of the two, and find that a single multilingual model with calibrated thresholds handles data imbalance and zero-shot generalization most effectively. Large multilingual models outperform the translation-based monolingual ones, and threshold calibration is crucial for maximizing micro-averaged F1; preprocessing plays only a minor role, and layer freezing hurts performance. The approach achieves top performance on multiple languages and demonstrates practical value for multilingual persuasion-detection tasks, with potential extensions to prompting and in-context learning.

Abstract

This paper presents the best-performing solution to SemEval-2023 Task 3, subtask 3, which is dedicated to persuasion techniques detection. Due to the highly multilingual character of the input data and the large number of predicted labels (23), which causes a lack of labelled data for some language-label combinations, we opted for fine-tuning pre-trained transformer-based language models. Through multiple experiments, we find the best configuration, which consists of a large multilingual model (XLM-RoBERTa large) trained jointly on all input data, with confidence thresholds carefully calibrated separately for seen and surprise languages. Our final system performed the best on 6 out of 9 languages (including two surprise languages) and achieved highly competitive results on the remaining three languages.
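The threshold calibration mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`micro_f1`, `calibrate_threshold`), the candidate grid, and the toy arrays are all assumptions made for illustration. It assumes the fine-tuned model already outputs one sigmoid probability per label, and it picks the single global threshold that maximizes micro-averaged F1 on a held-out development set.

```python
import numpy as np


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (sample, label) cells of two binary matrices."""
    tp = np.logical_and(y_true == 1, y_pred == 1).sum()
    fp = np.logical_and(y_true == 0, y_pred == 1).sum()
    fn = np.logical_and(y_true == 1, y_pred == 0).sum()
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0


def calibrate_threshold(probs, y_true, candidates=None):
    """Return (threshold, micro-F1) for the best candidate threshold.

    probs:  array of shape (n_samples, n_labels) with per-label probabilities.
    y_true: binary array of the same shape with gold labels.
    The candidate grid is a hypothetical choice, not taken from the paper.
    """
    if candidates is None:
        candidates = np.round(np.arange(0.05, 1.0, 0.05), 2)
    best_t, best_f1 = None, -1.0
    for t in candidates:
        f1 = micro_f1(y_true, (probs >= t).astype(int))
        if f1 > best_f1:  # strict comparison keeps the lowest best threshold
            best_t, best_f1 = float(t), f1
    return best_t, best_f1


# Toy development set: 2 samples, 3 labels (made-up numbers).
dev_probs = np.array([[0.9, 0.2, 0.4],
                      [0.3, 0.8, 0.1]])
dev_gold = np.array([[1, 0, 1],
                     [0, 1, 0]])

default_f1 = micro_f1(dev_gold, (dev_probs >= 0.5).astype(int))
best_t, best_f1 = calibrate_threshold(dev_probs, dev_gold)
print(f"default 0.5: F1={default_f1:.2f}; calibrated {best_t}: F1={best_f1:.2f}")
```

To obtain separate thresholds for seen and surprise languages, one plausible setup is to call `calibrate_threshold` twice: once on a development set drawn from the training languages, and once on a development set whose languages were held out of training to mimic the zero-shot setting. How the held-out split is built is left open here.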

Paper Structure (15 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: We perform multiple experiments to determine the best configuration for our solution. The performance-improving approaches (denoted in bold) are used in the final solution.
  • Figure 2: Results (aggregated across languages) from calibrating the confidence threshold for the XLM-RoBERTa (large) model for both the default setting and zero-shot setting.
  • Figure 3: Distribution of labels in the available data sets per language and persuasion technique.
  • Figure 4: Distribution of labels in the available data sets per language and persuasion technique.