The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content

Xinyu Wang, Sai Koneru, Pranav Narayanan Venkit, Brett Frischmann, Sarah Rajtmajer

TL;DR

The paper addresses the gap between social media moderation policies that hinge on user intent and the limited ability of current automated detection systems to capture intent from short text. It surveys taxonomies, policies, datasets, and detection algorithms, highlighting how context and intent are often underrepresented in both annotation guidelines and modeling features. The authors propose context-aware, policy-aligned, and explainable detection models, augmented by friction design and crowd-sourced moderation feedback, to align automated tools with evolving social norms and legal frameworks. The work emphasizes context-rich datasets and governance-centric system design to improve ethical alignment, transparency, and practical effectiveness in large-scale moderation. The authors call for interdisciplinary collaboration across NLP, cognitive science, ethics, and policy to realize robust intent-aware moderation in dynamic online ecosystems.

Abstract

As social media has become a predominant mode of communication globally, the rise of abusive content threatens to undermine civil discourse. Recognizing the critical nature of this issue, researchers have dedicated a significant body of work to developing language models that can detect various types of online abuse, e.g., hate speech and cyberbullying. However, there is a notable disconnect between platform policies, which often consider the author's intention as a criterion for content moderation, and the current capabilities of detection models, which typically do not attempt to capture intent. This paper examines the role of intent in content moderation systems. We review state-of-the-art detection models and benchmark training datasets for online abuse to assess their awareness of intent and their ability to capture it. We propose strategic changes to the design and development of automated detection and moderation systems to improve alignment with ethical and policy conceptualizations of abuse.

Paper Structure (26 sections, 4 figures, 2 tables)

This paper contains 26 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Distribution of papers based on features utilized by the detection models.
  • Figure 2: Example of prompts and detection model performance to showcase the importance of context and intent in understanding online abuse.
  • Figure 3: PRISMA diagram for the selection of papers presenting labeled datasets for online abuse.
  • Figure 4: PRISMA diagram for the selection of papers presenting detection algorithms for online abuse.