Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA
Marek Šuppa, Daniel Skala, Daniela Jašš, Samuel Sučík, Andrej Švec, Peter Hraška
TL;DR
This work assesses whether retrieval-augmented GPT-4 can perform climate activism stance, target, and hate event detection effectively without fine-tuning. The approach combines task-specific prompts, a Chroma-based retrieval system, and a RankT5-based re-ranking step, and achieves competitive results, including second place in Subtask B (target detection). An ablation with LLaMA 2 70B shows that GPT-4 scores higher on the evaluated metric (F1) and underscores the value of retrieval augmentation. The study also reveals limitations related to dataset labeling and to reproducibility, the latter because of reliance on commercial model access. Overall, it demonstrates the practical potential of LLM-based classification for multi-aspect social-media analysis in climate activism, and the submission codebase is publicly available.
Abstract
This study details our approach for the CASE 2024 Shared Task on Climate Activism Stance and Hate Event Detection, focusing on Hate Speech Detection, Hate Speech Target Identification, and Stance Detection as classification challenges. We explored the capability of Large Language Models (LLMs), particularly GPT-4, in zero- or few-shot settings enhanced by retrieval augmentation and re-ranking for tweet classification. Our goal was to determine whether LLMs could match or surpass traditional methods in this context. We conducted an ablation study with LLaMA for comparison, and our results indicate that our models significantly outperformed the baselines, securing second place in the Target Detection task. The code for our submission is available at https://github.com/NaiveNeuron/bryndza-case-2024.
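To make the retrieval-augmented few-shot setup concrete, the following is a minimal sketch of the general idea, not the authors' implementation (which is in the linked repository). It retrieves labelled tweets similar to the input, re-ranks them, and assembles a few-shot prompt. Several things here are assumptions for illustration only: the example tweets, the stance label names, the prompt wording, and all function names are hypothetical. A bag-of-words cosine similarity stands in for the vector store, a Jaccard overlap stands in for the learned re-ranker, and the call to the LLM itself is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical labelled tweets standing in for the training set.
EXAMPLES = [
    ("We must act now to stop climate change", "support"),
    ("Climate strikes are a waste of time", "oppose"),
    ("The weather is nice today", "neutral"),
    ("Climate change action cannot wait", "support"),
]
LABELS = ["support", "oppose", "neutral"]  # assumed label names


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a_tokens, b_tokens):
    """Cosine similarity between two bag-of-words token lists."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, examples, k=3):
    """First stage: return the k examples most similar to the query.

    Stand-in for an embedding-based vector store lookup.
    """
    q = tokenize(query)
    ranked = sorted(examples, key=lambda ex: cosine(q, tokenize(ex[0])), reverse=True)
    return ranked[:k]


def rerank(query, candidates, top_n=2):
    """Second stage: reorder candidates and keep the best top_n.

    Stand-in for a learned re-ranker; uses Jaccard token overlap.
    """
    q = set(tokenize(query))

    def jaccard(ex):
        t = set(tokenize(ex[0]))
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    return sorted(candidates, key=jaccard, reverse=True)[:top_n]


def build_prompt(tweet, shots, labels):
    """Assemble a few-shot classification prompt from the selected examples."""
    lines = [f"Classify the stance of the tweet as one of: {', '.join(labels)}.", ""]
    for text, label in shots:
        lines += [f"Tweet: {text}", f"Stance: {label}", ""]
    lines += [f"Tweet: {tweet}", "Stance:"]
    return "\n".join(lines)


if __name__ == "__main__":
    tweet = "We need to act on climate change now"
    shots = rerank(tweet, retrieve(tweet, EXAMPLES, k=3), top_n=2)
    print(build_prompt(tweet, shots, LABELS))
```

The resulting prompt string would then be sent to the LLM, and the returned label parsed as the prediction. Splitting selection into a broad retrieval stage and a narrower re-ranking stage keeps the prompt short while still drawing on the most relevant labelled examples.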
