Can LLMs Assist Annotators in Identifying Morality Frames? -- Case Study on Vaccination Debate on Social Media
Tunazzina Islam, Dan Goldwasser
TL;DR
This work investigates whether large language models (LLMs) can assist human annotators in identifying morality frames in vaccination-related social media discourse. It introduces a two-step framework: few-shot prompting with explanations generates morality-frame labels and rationales, which human annotators then validate through a dedicated web tool and a think-aloud evaluation. Empirical results show that LLMs prompted with explanations achieve about 90.8% overall accuracy on morality-frame identification and 92.7% moral foundation (MF) accuracy, with significant reductions in task difficulty and cognitive load for human annotators; an ablation confirms that the explanations are crucial for performance. The study demonstrates a promising, domain-agnostic approach to human–AI collaboration on complex psycholinguistic annotation tasks and outlines pathways for broader application and bias mitigation.
Abstract
Nowadays, social media is pivotal in shaping public discourse, especially on polarizing issues like vaccination, where diverse moral perspectives influence individual opinions. In NLP, data scarcity and the complexity of psycholinguistic tasks, such as identifying morality frames, make relying solely on human annotators costly, time-consuming, and prone to inconsistency due to cognitive load. To address these issues, we leverage large language models (LLMs), which are adept at adapting to new tasks through few-shot learning, using a handful of in-context examples coupled with explanations that connect the examples to task principles. Our research explores LLMs' potential to assist human annotators in identifying morality frames within vaccination debates on social media. We employ a two-step process: generating concepts and explanations with LLMs, followed by human evaluation using a "think-aloud" tool. Our study shows that integrating LLMs into the annotation process enhances accuracy, reduces task difficulty, and lowers cognitive load, suggesting a promising avenue for human-AI collaboration in complex psycholinguistic tasks.
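As a rough illustration of the first step (few-shot prompting with explanations), the sketch below assembles a prompt from in-context examples that each carry a label and an explanation, and parses a model completion into a label and rationale for later human validation. It is not the authors' implementation: the names `Demo`, `build_prompt`, and `parse_response`, the prompt wording, the assumed completion format, and the use of the six Moral Foundations Theory categories as the label set are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed label set: the six Moral Foundations Theory categories.
# The label scheme actually used in the study may differ.
MORAL_FOUNDATIONS = [
    "Care/Harm", "Fairness/Cheating", "Loyalty/Betrayal",
    "Authority/Subversion", "Purity/Degradation", "Liberty/Oppression",
]


@dataclass
class Demo:
    """One in-context example (hypothetical structure)."""
    text: str
    label: str
    explanation: str


def build_prompt(demos: List[Demo], query: str) -> str:
    """Build a few-shot prompt where each example includes an explanation."""
    lines = [
        "Identify the moral foundation expressed in each post.",
        "Options: " + ", ".join(MORAL_FOUNDATIONS),
        "",
    ]
    for d in demos:
        lines += [
            f"Post: {d.text}",
            f"Moral foundation: {d.label}",
            f"Explanation: {d.explanation}",
            "",
        ]
    # The query is left open so the model completes label and explanation.
    lines += [f"Post: {query}", "Moral foundation:"]
    return "\n".join(lines)


def parse_response(response: str) -> Tuple[Optional[str], str]:
    """Split a completion of the assumed form '<label>\\nExplanation: <text>'.

    Returns (label, explanation); label is None when it is not in the
    assumed label set, so the item can be flagged for human review.
    """
    label_part, _, explanation = response.partition("Explanation:")
    label = label_part.strip()
    if label not in MORAL_FOUNDATIONS:
        label = None
    return label, explanation.strip()
```

The parsed label and explanation would then be shown to a human annotator, who accepts or corrects them in the second step. Returning `None` for an unrecognized label is one simple way to route malformed model output to human review rather than guessing.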
