Community-based fact-checking reduces the spread of misleading posts on social media

Yuwei Chuai; Moritz Pilarski; Thomas Renault; David Restrepo-Amariles; Aurore Troussel-Clément; Gabriele Lenzini; Nicolas Pröllochs

TL;DR

Exposing users to community notes reduced the spread of misleading posts by 62.0% on average. However, the findings also suggest that community notes might be too slow to intervene in the early stage of the diffusion.

Abstract

Community-based fact-checking is a promising approach to verify social media content and correct misleading posts at scale. Yet, causal evidence regarding its effectiveness in reducing the spread of misinformation on social media is missing. Here, we performed a large-scale empirical study to analyze whether community notes reduce the spread of misleading posts on X. Using a Difference-in-Differences design and repost time series data for N=237,677 (community fact-checked) cascades that had been reposted more than 431 million times, we found that exposing users to community notes reduced the spread of misleading posts by, on average, 62.0%. Furthermore, community notes increased the odds that users delete their misleading posts by 103.4%. However, our findings also suggest that community notes might be too slow to intervene in the early (and most viral) stage of the diffusion. Our work offers important implications to enhance the effectiveness of community-based fact-checking approaches on social media.
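To make the two headline numbers concrete, the following is a minimal sketch, not the authors' estimation code. It assumes a simple ratio-of-ratios reading of a two-period Difference-in-Differences effect on repost counts, and it shows how a 103.4% increase in odds translates into a probability for an assumed baseline. All numbers, the 5% baseline deletion probability, and the function names `relative_did_effect` and `apply_odds_increase` are hypothetical and chosen only for illustration.

```python
from statistics import mean


def relative_did_effect(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Ratio-of-ratios DiD: relative change in the treated group
    beyond the change observed in the control group.

    Assumption: effects are multiplicative, as in a log-link count model.
    """
    treat_ratio = mean(treat_post) / mean(treat_pre)
    ctrl_ratio = mean(ctrl_post) / mean(ctrl_pre)
    return treat_ratio / ctrl_ratio - 1.0


def apply_odds_increase(baseline_prob, pct_increase):
    """Convert a percentage increase in odds into a new probability."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * (1.0 + pct_increase / 100.0)
    return new_odds / (1.0 + new_odds)


# Hypothetical repost counts per time window (not data from the paper).
treat_pre = [100, 120, 80]    # treated posts, before note display
treat_post = [20, 18, 19]     # treated posts, after note display
ctrl_pre = [100, 110, 90]     # control posts, same pre-period
ctrl_post = [50, 55, 45]      # control posts, same post-period

effect = relative_did_effect(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(f"relative DiD effect: {effect:.2f}")  # about -0.62 for these toy numbers

# Hypothetical 5% baseline deletion probability, odds increased by 103.4%.
new_prob = apply_odds_increase(0.05, 103.4)
print(f"deletion probability after odds increase: {new_prob:.3f}")
```

The toy data are constructed so that the control group's reposts halve naturally while the treated group's reposts fall to 19% of their earlier level, which gives a relative effect of about -0.62. The odds conversion illustrates that roughly doubling the odds does not double the probability unless the baseline is very small.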

Paper Structure (16 sections, 4 equations, 8 figures, 14 tables)

Descriptive Statistics
Data Overview
Fact-Checking Activity Over Time
Stability of Note Status
Identification of Topics
Propensity Score Matching
Estimation Results
Two-Period ATTs
Parallel Test and Multi-Period ATTs
Placebo Analyses
Sensitivity Analyses
Sensitivity Across Response Time
Sensitivity Across Months From Roll-Out
Sensitivity Across Rating Thresholds
Sensitivity Across User and Post Characteristics
...and 1 more section

Figures (8)

Figure 1: Data overview. (a) An example of a community note displayed on a misleading post on X. (b) Two-week rolling averages of the daily counts of fact-checked source posts, community notes, and displayed notes in our observation period from October 6, 2022 to June 11, 2024. (c) The cumulative distribution of the ratios of posts that received displayed notes at different post ages relative to all posts with displayed notes. (d) The cumulative distribution of the ratios of reposts at different post ages relative to all reposts within 36 hours. (e) The repost counts of retrieved posts (upper half plot, shown are median values) and the ratios of deleted posts (bottom half plot, shown are mean values) within groups of posts with displayed notes and posts without displayed notes. The error bars represent 99% Confidence Intervals (CIs).
Figure 2: Community notes reduce the spread of misleading posts on X. (a) Time series of repost counts within 15-minute intervals in the treatment group (red) and the control group (blue) from 4 hours before the display of community notes to 12 hours after the display of community notes. The error bands represent 99% Confidence Intervals (CIs). (b) Two-period (purple) and multi-period (yellow) ATTs estimated using a Difference-in-Differences (DiD) design and negative binomial regression models. For the two-period DiD model, the ATT is calculated as $\Delta\text{Treatment} - \Delta\text{Control} = -0.62$. For the multi-period DiD model, the yellow circles and error bars show the estimated hourly multi-period ATTs (with 99% CIs). The grey band (with 99% CIs) visualizes the observed extra reduction of the ratio of reposts in the treatment group relative to reposts in the control group and compared to the ratio of reposts before the display of community notes. The ATT estimations are based on 614,520 repost time series observations for $N = 40{,}968$ posts. Post-level random effects are included. Full estimation results are in \ref{tab:did_main_fixed} and \ref{tab:did_main_multi}.
Figure 3: Sensitivity analysis. (a) The estimated ATTs across different response times from post creation to note display (grouped within 4-hour windows). Full regression results are reported in \ref{tab:did_sensitivity_cohorts}. (b) The estimated ATTs of community notes across the months following the roll-out of the "Community Notes" program in October 2022. Full regression results are reported in \ref{tab:did_sensitivity_mfro}. (c) The estimated ATTs of community notes depending on the number of ratings (10--80) by other fact-checking contributors. Here, we consider only community notes for misleading posts that have never been rated as not helpful. Full regression results are reported in \ref{tab:did_sensitivity_ratings}. (d) The estimated ATTs within subgroups separated by user and post characteristics. Full regression results are reported in \ref{tab:did_sensitivity_user_charcs}, \ref{tab:did_sensitivity_post_charcs1}, and \ref{tab:did_sensitivity_post_charcs2}. In all plots, the error bars represent 99% CIs and the grey bands visualize the ATT (with 99% CIs) estimated via the two-period DiD model from our main analysis.
Figure 4: Effect of community notes on the cumulative repost count of misleading posts on X. (a) The changes in the ratio of the reduction of reposts relative to the predicted overall repost count over the response time from post creation to note display (grouped within 1-hour windows). The grey band ranges between the mean and the median of the ratio of the reduction. (b) The estimated cumulative count of reposts that community notes prevent at different post ages. The error band represents 99% CIs. (c) CCDFs showing the actually observed repost count for source posts with displayed community notes and the predicted repost count that the source posts would have received in the absence of displayed community notes. (d) The estimated overall reduction in reposts if the community notes on all posts had been displayed at the same post age, for post ages from 2 to 36 hours after post creation. Statistical significance ($^{*}p<0.01$; $^{**}p<0.005$; $^{***}p<0.001$) was calculated using two-tailed KS tests.
Figure 5: Effect of community notes on the deletion of misleading posts on X. Shown are the ratios of deleted posts for different values of note helpfulness. Note helpfulness scores are re-centered based on the cutoff point of 0.4. Only community notes with note helpfulness scores of 0.4 or above are displayed on the corresponding misleading posts. The note helpfulness scores are rounded to two decimal places. Notes with helpfulness scores of 0.39 are omitted to prevent treatment contamination due to fluctuations between recalculated note helpfulness scores and note helpfulness scores used in production. The error bands represent 99% CIs. See \ref{supp:deletedPosts} for further details and full estimation results.
...and 3 more figures
