Measuring and Forecasting Conversation Incivility: the Role of Antisocial and Prosocial Behaviors

Xinchen Yu, Hayden Arnold, Benjamin Su, Eduardo Blanco

TL;DR

The paper presents a multi-dimensional framework for measuring and forecasting conversation incivility following replies to hate speech, covering four antisocial and four prosocial behaviors. It introduces a data-driven incivility metric S(r) that combines antisocial, prosocial, and neutral signals while incorporating per-user weighting, and validates it against human judgments, achieving substantial agreement. Empirical analyses on a large Reddit dataset reveal distinct linguistic patterns for antisocial and prosocial posts and show that antisocial content more often prompts re-engagement, though prosocial content is prevalent in longer, multi-turn discussions. Supervised models using hate-post and reply context outperform GPT-4o prompting in predicting incivility levels, with pretraining and data blending providing additional gains; error analysis highlights challenges such as sarcasm and negation. Overall, the work offers an approach to anticipating escalation in online discourse and can inform moderation strategies that promote civil engagement, while acknowledging methodological and domain limitations.

Abstract

This paper focuses on the task of measuring and forecasting incivility in conversations following replies to hate speech. Identifying replies that steer conversations away from hatred and elicit civil follow-up conversations sheds light on effective strategies to engage with hate speech and proactively avoid further escalation. We propose new metrics that take into account various dimensions of antisocial and prosocial behaviors to measure the conversation incivility following replies to hate speech. Our best metric aligns with human perceptions better than prior work. Additionally, we present analyses on a) the language of antisocial and prosocial posts, b) the relationship between antisocial or prosocial posts and user interactions, and c) the language of replies to hate speech that elicit follow-up conversations with different incivility levels. We show that forecasting the incivility level of conversations following a reply to hate speech is a challenging task. We also present qualitative analyses to identify the most common errors made by our best model.

Paper Structure

This paper contains 31 sections, 1 equation, 6 figures, and 7 tables.

Figures (6)

  • Figure 1: Hateful Reddit post (top), three direct replies, and the follow-up conversations. The first reply steers the follow-up conversation towards civil behaviors. The second reply elicits additional incivility. The last reply is generated by ChatGPT. It does not address the hateful post directly or elicit a follow-up conversation.
  • Figure 2: Spearman's rank correlation coefficients between antisocial and prosocial behaviors. $a_i$ and $p_i$ indicate antisocial and prosocial behaviors in the order described in the paper. All coefficients are low ($< 0.3$) except the one between $a_2$ (explicit hate speech) and $a_3$ (abusive language).
  • Figure 3: Illustration of our metric to estimate incivility of the conversation following a reply $r$ to a hateful post ($S(r)$). We calculate four antisocial and four prosocial behaviors for each post. The metric consists of components to account for antisocial ($A(r)$), prosocial ($P(r)$), and neutral behaviors ($N(r)$); and considers not only the amount of each behavior but also whether different authors generate posts with the same behavior (not shown).
  • Figure 4: Cohen's $\kappa$ coefficients between human annotations (after adjudication) and using our metric to determine which of two conversations is more uncivil. Cells indicate the highest $\kappa$ obtained with the corresponding combination of antisocial and prosocial behaviors after trying all combinations of $\alpha$ and $\beta$. Prosocial behaviors by themselves underperform, and combining several antisocial behaviors outperforms individual antisocial behaviors. The optimal choice is $(a_1, a_2, a_3)$ and $(p_3)$.
  • Figure 5: Template to generate zero-shot prompts for GPT-4o.
  • ...and 1 more figure
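
The validation summarized in the Figure 4 caption compares human judgments of which of two conversations is more uncivil against the choice implied by the metric, and reports Cohen's $\kappa$. The sketch below shows only that agreement computation. It does not reproduce $S(r)$ or the search over $\alpha$ and $\beta$, whose definitions are not given here. The function names, the toy scores, the convention that a larger score means more uncivil, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def pairwise_choice(score_first, score_second):
    """Pick the conversation with the larger score (assumed: larger = more uncivil).

    Ties go to "second"; this tie-breaking rule is an arbitrary assumption.
    """
    return "first" if score_first > score_second else "second"


# Hypothetical toy data: adjudicated human choices for six conversation pairs,
# and made-up metric scores for the two conversations in each pair.
human = ["first", "second", "first", "second", "first", "second"]
metric_scores = [(0.9, 0.2), (0.1, 0.7), (0.3, 0.6),
                 (0.2, 0.8), (0.8, 0.4), (0.4, 0.5)]

predicted = [pairwise_choice(a, b) for a, b in metric_scores]
kappa = cohens_kappa(human, predicted)
print(f"kappa = {kappa:.3f}")
```

In this toy setup the metric agrees with the human choice on five of six pairs and chance agreement is 0.5, so $\kappa$ comes out at about 0.667. Repeating the computation for each candidate combination of behaviors and keeping the highest $\kappa$ would mirror the cell-wise comparison the caption describes.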