PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent
Jiateng Liu, Lin Ai, Zizhou Liu, Payam Karisani, Zheng Hui, May Fung, Preslav Nakov, Julia Hirschberg, Heng Ji
TL;DR
This work addresses the need to move beyond merely identifying propaganda techniques to understanding the underlying motives and potential impacts. It introduces PropaInsight, a framework that decomposes propaganda into techniques, arousal appeals, and underlying intent, and PropaGaze, a dataset combining human-annotated and high-quality synthetic data to enable granular analysis and robust model training. The authors demonstrate that off-the-shelf LLMs struggle in zero-shot settings but that fine-tuning with PropaGaze substantially improves technique identification and appeal/intent analysis, with notable benefits in data-sparse and cross-domain scenarios. The approach highlights the value of synthetic data for mitigating annotation bottlenecks and enabling generalizable propaganda analysis across domains, while acknowledging limitations in dataset size and domain coverage. Overall, PropaInsight and PropaGaze offer a scalable path toward deeper, more interpretable propaganda analysis with potential applications in disinformation detection and media bias assessment.
Abstract
Propaganda plays a critical role in shaping public opinion and fueling disinformation. Existing research focuses primarily on identifying propaganda techniques and does not capture the broader motives behind such content or its impact. To address this gap, we introduce PropaInsight, a conceptual framework grounded in foundational social science research that systematically dissects propaganda into techniques, arousal appeals, and underlying intent. PropaInsight offers a more granular understanding of how propaganda operates across different contexts. Additionally, we present PropaGaze, a novel dataset that combines human-annotated data with high-quality synthetic data generated through a meticulously designed pipeline. Our experiments show that off-the-shelf LLMs struggle with propaganda analysis, but training with PropaGaze significantly improves performance. Fine-tuned Llama-7B-Chat achieves 203.4% higher text span IoU in technique identification and 66.2% higher BERTScore in appeal analysis compared to 1-shot GPT-4-Turbo. Moreover, PropaGaze complements limited human-annotated data in data-sparse and cross-domain scenarios, showing its potential for comprehensive and generalizable propaganda analysis.
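The abstract reports technique identification in terms of text span IoU but does not define the metric. The sketch below shows one common way to compute it, and is not necessarily the authors' exact formulation: it assumes spans are half-open character offsets `(start, end)` and that IoU is taken over the sets of covered characters. The names `span_iou` and `relative_gain_pct` are hypothetical helpers introduced for illustration only.

```python
from typing import Iterable, Set, Tuple

Span = Tuple[int, int]  # half-open character offsets: [start, end)


def _covered_chars(spans: Iterable[Span]) -> Set[int]:
    """Return the set of character positions covered by the given spans."""
    return {i for start, end in spans for i in range(start, end)}


def span_iou(pred: Iterable[Span], gold: Iterable[Span]) -> float:
    """Character-level intersection-over-union between two sets of spans."""
    pred_chars = _covered_chars(pred)
    gold_chars = _covered_chars(gold)
    union = pred_chars | gold_chars
    if not union:
        # Assumption: two empty predictions count as perfect agreement.
        return 1.0
    return len(pred_chars & gold_chars) / len(union)


def relative_gain_pct(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


if __name__ == "__main__":
    # Predicted span overlaps the gold span on 5 of 15 covered characters.
    print(span_iou([(0, 10)], [(5, 15)]))
    # A score three times the baseline is a 200% relative gain.
    print(relative_gain_pct(3.0, 1.0))
```

If the reported figure is read as a relative improvement of this kind, a 203.4% gain means the fine-tuned model's IoU is roughly 3.03 times the 1-shot GPT-4-Turbo score, not 203.4 percentage points above it.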
