Community Fact-Checks Trigger Moral Outrage in Replies to Misleading Posts on Social Media

Yuwei Chuai; Anastasia Sergeeva; Gabriele Lenzini; Nicolas Pröllochs

TL;DR

This study investigates how displaying community fact-checks (Community Notes) on misleading posts alters the emotions users express in replies. Using a large panel of $N=2{,}225{,}260$ replies to $1841$ source posts and a Regression Discontinuity in Time design, the authors quantify changes in positive and negative sentiment, Ekman's basic emotions, and moral outrage. They find that note display increases negativity by $7.3\%$, anger by $13.2\%$, disgust by $4.7\%$, and moral outrage by $16.0\%$, with larger effects for political posts; positive sentiment declines, and surprise also drops. Robustness checks across bandwidths, lexicons, and classifiers support the results, which have implications for the design of crowd-based fact-checking and for understanding potential polarization effects in online discourse. Overall, the paper highlights a trade-off: community fact-checks can reduce engagement with misinformation but may provoke moral emotions that shape the tone and dynamics of discussion.

Abstract

Displaying community fact-checks is a promising approach to reduce engagement with misinformation on social media. However, how users respond to misleading content emotionally after community fact-checks are displayed on posts is unclear. Here, we employ quasi-experimental methods to causally analyze changes in sentiments and (moral) emotions in replies to misleading posts following the display of community fact-checks. Our evaluation is based on a large-scale panel dataset comprising N=2,225,260 replies across 1841 source posts from X's Community Notes platform. We find that informing users about falsehoods through community fact-checks significantly increases negativity (by 7.3%), anger (by 13.2%), disgust (by 4.7%), and moral outrage (by 16.0%) in the corresponding replies. These results indicate that users perceive spreading misinformation as a violation of social norms and that those who spread misinformation should expect negative reactions once their content is debunked. We derive important implications for the design of community-based fact-checking systems.

Paper Structure (58 sections, 4 equations, 11 figures, 48 tables)

Introduction
Background and Related Work
Misinformation on Social Media
Content Moderation for Misinformation
Sentiments, Basic Emotions, and Moral Outrage on Social Media
Data and Methods
Datasets
Note dataset.
Post dataset.
Reply dataset.
Sentiments and Basic Emotions
Political vs. Non-Political Posts
Empirical Models
Dependent variables.
Display indicator and running variable.
...and 43 more sections

Figures (11)

Figure 1: Research overview. (a) An example of a misleading post with a displayed community note and one direct reply after note display. (b) Illustration of our research setup. Before note display, users who reply to the source posts can only see the original post content. After note display, users who reply to the source posts can see both the original post content and the displayed community notes flagging the post content as misleading. Using the sentiments and emotions in replies before note display as a baseline, we can examine the changes in sentiments and emotions in replies after the display of community notes.
Figure 2: Summary statistics for misleading source posts, direct replies, and display of community notes. (a) The 2-week rolling average daily number of misleading source posts that have displayed community notes attached. (b) The means of positive and negative sentiments in the misleading source posts. (c) The means of anger, disgust, fear, joy, sadness, and surprise in the misleading source posts. (d) The Complementary Cumulative Distribution Function (CCDF) for the number of replies that are directed at each misleading source post. (e) The CCDF for hours from post creation to note display. (f) The CCDF for the ratio of replies before note display to the total replies. (g) The hourly averages of negative sentiment in replies across hours from note display. (h) The hourly averages of anger in replies across hours from note display. (i) The hourly averages of surprise in replies across hours from note display. The hourly averages of positive sentiment and other emotions are shown in the Supplement. The error bars (bands) represent 95% Confidence Intervals (CIs). Notably, previous studies consistently suggest a potential cold-start period of 4 hours for community notes to reach their full effect [chuai2024community; renault2024collaboratively]. Therefore, we visually omit reply points within the initial four hours after the display of community notes for better readability.
Figure 3: The estimated coefficients for the independent variables -- $\bm{\mathit{Displayed}}$, $\bm{\mathit{HoursFromDisplay}}$, and source sentiments (or emotions). The independent variable $\bm{\mathit{PostAge}}$ is included during estimation but omitted in the visualization for better readability. Shown are mean values with error bars representing 95% CIs. Standard errors are clustered by source posts. The dependent variables are (a) positive sentiment in replies, (b) negative sentiment in replies, (c) anger in replies, (d) disgust in replies, (e) fear in replies, (f) joy in replies, (g) sadness in replies, and (h) surprise in replies, respectively. The full estimation results are reported in the Supplement.
Figure 4: The predicted effects of the display of community notes on sentiments and emotions in replies. (a) The predicted effect on positive sentiment in replies. (b) The predicted effect on negative sentiment in replies. (c) The predicted effect on anger in replies. (d) The predicted effect on disgust in replies. (e) The predicted effect on fear in replies. (f) The predicted effect on joy in replies. (g) The predicted effect on sadness in replies. (h) The predicted effect on surprise in replies. The grey points indicate the hourly averages of sentiments or emotions between 16 hours before and 16 hours after the display of community notes. Similar to Fig. 2, we visually omit reply points within the initial four hours after the display of community notes for better readability. The blue lines indicate the averages of sentiments or emotions in replies over the 16 hours before note display and represent the baselines during the before-display period. The yellow lines indicate the averages of sentiments or emotions over the 16 hours after the display of community notes. The average at the yellow line is the sum of the predicted effect and the corresponding baseline at the blue line in each figure. The yellow bands represent 95% CIs. The predicted effects are transformed based on the coefficient estimates of $\bm{\mathit{Displayed}}$ (see Fig. 3). The full estimation results are reported in the Supplement.
Figure 5: The RDD coefficient estimates (i.e., $\bm{\mathit{Displayed}}$) for sentiments and emotions in replies across three different bandwidths. The three bandwidths are 16 hours (16H), 1 week (1W), and reply lifespan (LSP), respectively. Shown are mean values with error bars representing 95% CIs. Standard errors are clustered by source posts. The full estimation results are reported in the Supplement.
...and 6 more figures
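
The regression discontinuity idea behind the captions above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' model: the data are synthetic, the function names (`ols`, `simulate_replies`, `estimate_display_effect`) and all numeric values are hypothetical, and the regression keeps only an intercept, a display indicator, and hours from display. The paper's specification also includes PostAge and source sentiments (or emotions) and clusters standard errors by source post, which this sketch omits. Reading the reported percentages as the estimated effect divided by the pre-display baseline is likewise an assumption.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate_replies(n=20000, jump=0.02, seed=0):
    """Synthetic replies within a 16-hour bandwidth around note display.

    All values (baseline 0.30, slope 0.001, noise 0.05, jump) are made up.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hours = rng.uniform(-16.0, 16.0)          # running variable
        displayed = 1.0 if hours >= 0 else 0.0    # display indicator
        negativity = (0.30 + jump * displayed + 0.001 * hours
                      + rng.gauss(0.0, 0.05))
        rows.append((hours, displayed, negativity))
    return rows


def estimate_display_effect(rows):
    """Return the coefficient on the display indicator (the discontinuity)."""
    X = [[1.0, displayed, hours] for hours, displayed, _ in rows]
    y = [negativity for _, _, negativity in rows]
    _intercept, effect, _slope = ols(X, y)
    return effect


rows = simulate_replies()
effect = estimate_display_effect(rows)

# Baseline: mean outcome in replies posted before the note was displayed.
before = [negativity for hours, _, negativity in rows if hours < 0]
baseline = sum(before) / len(before)
relative_change = effect / baseline

print(f"estimated jump at display: {effect:.4f}")
print(f"relative to pre-display baseline: {relative_change:.1%}")
```

Controlling for the running variable is what separates the jump at display time from the gradual drift in reply tone as a post ages; a plain before/after comparison of means would mix the two.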
