CPL-NoViD: Context-Aware Prompt-based Learning for Norm Violation Detection in Online Communities

Zihao He, Jonathan May, Kristina Lerman

TL;DR

CPL-NoViD introduces a context-aware prompt-based framework for norm violation detection in online communities, framing each detection as a natural-language prompt that embeds subreddit rules and preceding conversation context. By integrating context directly into prompts and leveraging a pretrained language model without introducing extra context-encoder parameters, CPL-NoViD achieves state-of-the-art macro-F1 performance and robust generalization to unseen rule types and communities, while performing well in few-shot settings. Extensive experiments on the NORMVIO Reddit dataset demonstrate CPL-NoViD’s superior cross-rule-type and cross-community generalizability, with ablation studies confirming the benefit of context and illustrating how thread length affects performance. The work highlights the practical potential of prompt-based, context-aware models for scalable, adaptable content moderation, and discusses ethical considerations, limitations, and future directions for more nuanced and responsible deployment.

Abstract

Detecting norm violations in online communities is critical to maintaining healthy and safe spaces for online discussions. Existing machine learning approaches often struggle to adapt to the diverse rules and interpretations across different communities due to the inherent challenges of fine-tuning models for such context-specific tasks. In this paper, we introduce Context-aware Prompt-based Learning for Norm Violation Detection (CPL-NoViD), a novel method that employs prompt-based learning to detect norm violations across various types of rules. CPL-NoViD outperforms the baseline by incorporating context through natural language prompts and demonstrates improved performance across different rule types. Significantly, it not only excels in cross-rule-type and cross-community norm violation detection but also exhibits adaptability in few-shot learning scenarios. Most notably, it establishes a new state-of-the-art in norm violation detection, surpassing existing benchmarks. Our work highlights the potential of prompt-based learning for context-sensitive norm violation detection and paves the way for future research on more adaptable, context-aware models to better support online community moderators.
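To make the idea of "incorporating context through natural language prompts" concrete, here is a minimal sketch of how a rule, the preceding comments, and the target comment might be folded into a single cloze-style prompt. The template wording, the function names `build_prompt` and `label_from_word`, the `[MASK]` placeholder, and the yes/no verbalizer are all assumptions made for illustration; they are not taken from the paper, whose exact prompt may differ.

```python
from typing import Dict, List

# Hypothetical verbalizer: maps the word predicted at the mask position
# to a binary label (1 = violation, 0 = no violation). Illustrative only.
VERBALIZER: Dict[str, int] = {"yes": 1, "no": 0}


def build_prompt(subreddit: str, rule: str, context: List[str],
                 target: str, mask_token: str = "[MASK]") -> str:
    """Assemble one natural-language prompt from rule, context, and target.

    The template below is an assumed wording, not the authors' template.
    """
    parts = [f"In the subreddit r/{subreddit}, there is a rule: {rule}"]
    # Preceding comments are written out in order, so the conversation
    # context lives in the prompt text itself rather than in a separate encoder.
    for i, comment in enumerate(context, start=1):
        parts.append(f"Comment {i}: {comment}")
    parts.append(f"Final comment: {target}")
    # Cloze question: a masked language model would fill the mask position.
    parts.append(f"Does the final comment violate the rule? {mask_token}")
    return " ".join(parts)


def label_from_word(word: str) -> int:
    """Convert a predicted label word into a class id via the verbalizer."""
    return VERBALIZER[word.strip().lower()]


if __name__ == "__main__":
    prompt = build_prompt(
        subreddit="example",
        rule="Be civil",
        context=["I think the new policy is fine.", "I disagree."],
        target="Only an idiot would think that.",
    )
    print(prompt)
```

Because the context is serialized into the prompt string, no additional context-encoder parameters are needed in this sketch; the pretrained language model reads the rule and the thread as ordinary text.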

Paper Structure

This paper contains 29 sections, 3 figures, and 6 tables.

Figures (3)

  • Figure 1: The architecture of CPL-NoViD. It is designed for prompt-based learning and integrates the conversation context into the prompt for a holistic understanding.
  • Figure 2: Sector diagram of different rule types in NORMVIO. The diagram is adapted from Park et al. (2021).
  • Figure 3: Distribution of the number of conversations in each subreddit. The distribution is heavily long-tailed.