Zero-Shot Stance Detection using Contextual Data Generation with LLMs
Ghazaleh Mahmoudi, Babak Behkamkia, Sauleh Eetemadi
TL;DR
This work tackles zero-shot stance detection under data scarcity by proposing DyMoAdapt, a test-time fine-tuning pipeline that uses GPT-3 to generate topic-specific synthetic data for adapting models to unseen topics. It also introduces MGT-VAST, an expanded dataset in which each context is paired with multiple GPT-3-generated topics so that models can learn topic-context relationships. Experiments show that BERT and GPT-3 achieve strong performance on MGT-VAST, with DyMoAdapt providing notable improvements on SemEval2016-T6 in certain settings but facing limitations with neutral data. The study highlights both the potential and the constraints of LLM-driven data augmentation for real-time topic adaptation in stance detection, pointing to future work on improving neutral-data generation and exploring alternative augmentation methods.
Abstract
Stance detection, the classification of the attitude a text expresses towards a specific topic, is vital for applications such as fake news detection and opinion mining. However, the scarcity of labeled data remains a challenge for this task. To address this problem, we propose Dynamic Model Adaptation with Contextual Data Generation (DyMoAdapt), which combines Few-Shot Learning and Large Language Models. In this approach, we fine-tune an existing model at test time on new topic-specific data generated using GPT-3. This method could enhance performance by adapting the model to new topics. However, the results did not improve as we had expected. Furthermore, we introduce the Multi Generated Topic VAST (MGT-VAST) dataset, which extends VAST using GPT-3. In this dataset, each context is associated with multiple topics, allowing the model to understand the relationship between contexts and various potential topics.
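The test-time adaptation loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: `ToyStanceModel` is a hypothetical word-count classifier standing in for a fine-tunable model such as BERT, `stub_generator` is a hypothetical placeholder for the GPT-3 call and returns canned sentences, and the label names `pro`, `con`, and `neutral` are assumptions.

```python
import copy
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str, str]  # (topic, text, stance label)


class ToyStanceModel:
    """Hypothetical stand-in for a fine-tunable classifier (assumption)."""

    def __init__(self) -> None:
        # Per-label word counts; a real model would hold network weights.
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)

    def fine_tune(self, examples: List[Example]) -> None:
        # "Training" here just accumulates word counts for each label.
        for _topic, text, label in examples:
            self.word_counts[label].update(text.lower().split())

    def predict(self, topic: str, text: str) -> str:
        words = text.lower().split()
        scores = {
            label: sum(counts[w] for w in words)
            for label, counts in self.word_counts.items()
        }
        # Fall back to "neutral" when the model has seen no data at all.
        return max(scores, key=scores.get) if scores else "neutral"


def stub_generator(topic: str) -> List[Example]:
    """Placeholder for an LLM call; returns canned examples, not GPT-3 output."""
    return [
        (topic, f"I fully support {topic}", "pro"),
        (topic, f"I strongly oppose {topic}", "con"),
    ]


def adapt_and_predict(
    base_model: ToyStanceModel,
    test_items: List[Tuple[str, str]],
    generate: Callable[[str], List[Example]],
) -> List[str]:
    """For each unseen topic, fine-tune a copy of the base model on
    generated topic-specific data, then predict with that copy."""
    adapted: Dict[str, ToyStanceModel] = {}
    predictions = []
    for topic, text in test_items:
        if topic not in adapted:
            model = copy.deepcopy(base_model)  # leave the base model untouched
            model.fine_tune(generate(topic))
            adapted[topic] = model
        predictions.append(adapted[topic].predict(topic, text))
    return predictions


base = ToyStanceModel()
items = [("nuclear power", "I oppose this"), ("nuclear power", "I support this")]
predictions = adapt_and_predict(base, items, stub_generator)
```

Copying the base model per topic is one possible design choice: it keeps adaptation to one topic from leaking into predictions for another. Because the stub generator produces no neutral examples, the adapted copies in this sketch can never predict that class.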
