PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent

Jiateng Liu, Lin Ai, Zizhou Liu, Payam Karisani, Zheng Hui, May Fung, Preslav Nakov, Julia Hirschberg, Heng Ji

TL;DR

This work addresses the need to move beyond merely identifying propaganda techniques to understanding the underlying motives and potential impacts. It introduces PropaInsight, a framework that decomposes propaganda into techniques, arousal appeals, and underlying intent, and PropaGaze, a dataset combining human-annotated and high-quality synthetic data to enable granular analysis and robust model training. The authors demonstrate that off-the-shelf LLMs struggle in zero-shot settings but that fine-tuning with PropaGaze substantially improves technique identification and appeal/intent analysis, with notable benefits in data-sparse and cross-domain scenarios. The approach highlights the value of synthetic data for mitigating annotation bottlenecks and enabling generalizable propaganda analysis across domains, while acknowledging limitations in dataset size and domain coverage. Overall, PropaInsight and PropaGaze offer a scalable path toward deeper, more interpretable propaganda analysis with potential applications in disinformation detection and media bias assessment.

Abstract

Propaganda plays a critical role in shaping public opinion and fueling disinformation. While existing research primarily focuses on identifying propaganda techniques, it lacks the ability to capture the broader motives and the impacts of such content. To address these challenges, we introduce PropaInsight, a conceptual framework grounded in foundational social science research, which systematically dissects propaganda into techniques, arousal appeals, and underlying intent. PropaInsight offers a more granular understanding of how propaganda operates across different contexts. Additionally, we present PropaGaze, a novel dataset that combines human-annotated data with high-quality synthetic data generated through a meticulously designed pipeline. Our experiments show that off-the-shelf LLMs struggle with propaganda analysis, but training with PropaGaze significantly improves performance. Fine-tuned Llama-7B-Chat achieves 203.4% higher text span IoU in technique identification and 66.2% higher BERTScore in appeal analysis compared to 1-shot GPT-4-Turbo. Moreover, PropaGaze complements limited human-annotated data in data-sparse and cross-domain scenarios, showing its potential for comprehensive and generalizable propaganda analysis.
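
To make the "text span IoU" figure concrete, the sketch below computes intersection-over-union for two character spans and shows how a relative gain such as "203.4% higher" is read. It assumes half-open `(start, end)` character offsets and a single predicted span matched to a single gold span; the paper's exact matching and aggregation may differ. The function names and the sample scores are illustrative and are not taken from the paper.

```python
from typing import Tuple

Span = Tuple[int, int]  # half-open character offsets: (start, end)


def span_iou(pred: Span, gold: Span) -> float:
    """Intersection-over-union of two character spans (assumed definition)."""
    intersection = max(0, min(pred[1], gold[1]) - max(pred[0], gold[0]))
    union = (pred[1] - pred[0]) + (gold[1] - gold[0]) - intersection
    return intersection / union if union > 0 else 0.0


def relative_gain_pct(new_score: float, baseline: float) -> float:
    """Relative improvement in percent: 100 * (new - baseline) / baseline."""
    return (new_score - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Predicted span covers characters 10..30, gold span covers 20..40:
    # 10 shared characters out of 30 covered in total.
    print(span_iou((10, 30), (20, 40)))

    # Hypothetical scores, chosen only to illustrate the arithmetic:
    # a gain of about 203.4% means the new score is roughly 3.034x the baseline.
    print(relative_gain_pct(0.3034, 0.1))
```

Note that a relative gain above 100% means the fine-tuned score is more than double the baseline, not that the IoU itself exceeds 1.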

Paper Structure (57 sections, 4 figures, 6 tables)

Figures (4)

  • Figure 1: We abstract key elements of propaganda from social science literature. A comprehensive propaganda frame includes the techniques employed, the appeals evoked in readers, and the author's underlying intent.
  • Figure 2: Partially controlled data generation pipeline: We first collect real-world news articles and derive an objective summary to extract events. Then we generate event-based intent and randomly sample specific propaganda techniques to insert into the event descriptions. Lastly, we generate appeals from a reader's perspective, so that the appeals are grounded in the text.
  • Figure 3: The user interface we used in Label Studio to annotate intent based on a given article.
  • Figure 4: The user interface we used in Label Studio to annotate appeals based on a context. The highlighted part is the sentence to be annotated, while the other parts of 'Target Article' provide related context.
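
The generation steps described in the Figure 2 caption can be sketched roughly as follows. This is a minimal outline under stated assumptions, not the authors' implementation: `llm` is a hypothetical text-in, text-out callable, the prompts are invented for illustration, and `TECHNIQUES` is a placeholder list rather than the paper's technique inventory.

```python
import random
from typing import Callable, Dict, List

# Placeholder technique names for illustration only; not the paper's list.
TECHNIQUES: List[str] = [
    "loaded_language",
    "name_calling",
    "appeal_to_fear",
    "flag_waving",
]


def generate_example(
    article: str,
    llm: Callable[[str], str],  # hypothetical text-generation callable
    rng: random.Random,
    k: int = 2,
) -> Dict[str, object]:
    """One pass of the pipeline: summary -> intent -> techniques -> appeals."""
    # 1. Derive an objective summary of the article to extract its events.
    summary = llm(f"Summarize objectively and list the events:\n{article}")
    # 2. Generate an intent grounded in those events.
    intent = llm(f"Propose an author intent based on these events:\n{summary}")
    # 3. Randomly sample techniques and insert them into the event description.
    techniques = rng.sample(TECHNIQUES, k)
    rewritten = llm(
        f"Rewrite the events to convey the intent '{intent}' "
        f"using these techniques: {', '.join(techniques)}\n{summary}"
    )
    # 4. Generate appeals from a reader's perspective on the rewritten text.
    appeals = llm(f"As a reader, describe what this text makes you feel:\n{rewritten}")
    return {
        "summary": summary,
        "intent": intent,
        "techniques": techniques,
        "text": rewritten,
        "appeals": appeals,
    }


if __name__ == "__main__":
    # Stub model that echoes the first line of each prompt, so the sketch runs offline.
    stub = lambda prompt: "stub: " + prompt.splitlines()[0]
    example = generate_example("Some news article text.", stub, random.Random(0))
    print(example["techniques"])
```

Passing the model in as a callable keeps the control flow testable without network access; only the technique sampling step is random, which matches the "partially controlled" description in the caption.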