IAI Group at CheckThat! 2024: Transformer Models and Data Augmentation for Checkworthy Claim Detection
Peter Røysland Aarnes, Vinay Setty, Petra Galuščáková
TL;DR
The paper tackles automated check-worthiness estimation for English, Dutch, and Arabic in CheckThat! 2024, using transformer-based encoder and decoder models, data augmentation, cross-language transfer learning, and few-shot chain-of-thought reasoning. It compares language-specific fine-tuning strategies across RoBERTa, XLM-RoBERTa, GPT-4, Mistral-7b, and GPT-3.5, including translation-based augmentation and style-transfer experiments. The Arabic submission tops the leaderboard, the Dutch submission places third, and the English submission lags behind in ninth place. Performance drops notably from the development set to the test set, which points to distribution shift and a risk of overfitting. The work highlights the potential and limitations of language-specific adaptations for check-worthiness detection and informs future work on robust cross-lingual claim detection under real-world data distributions.
Abstract
This paper describes the IAI group's participation in automated check-worthiness estimation for claims, within the framework of the 2024 CheckThat! Lab "Task 1: Check-Worthiness Estimation". The task involves the automated detection of check-worthy claims in English, Dutch, and Arabic political debates and Twitter data. We used various pre-trained generative decoder and encoder transformer models, employing methods such as few-shot chain-of-thought reasoning, fine-tuning, data augmentation, and transfer learning from one language to another. Although performance varied across languages, our models achieved notable placements on the organizers' leaderboard: ninth-best in English, third-best in Dutch, and the top placement in Arabic, using multilingual datasets to enhance the generalizability of check-worthiness detection. Performance dropped significantly on the unlabeled test dataset compared to the development test dataset. Nevertheless, our findings contribute to ongoing efforts in claim detection research, highlighting the challenges and potential of language-specific adaptations in claim verification systems.
