Corrective or Backfire: Characterizing and Predicting User Response to Social Correction

Bing He; Yingchen Ma; Mustaque Ahamad; Srijan Kumar

TL;DR

The paper studies how real-world users respond to social correction of misinformation on Twitter. It builds a conversational dataset of 55,549 triples, each consisting of a misinformation tweet, a counter-reply, and a response to that counter-reply. It introduces a taxonomy of user responses, runs fine-grained linguistic, engagement, and poster-level analyses to identify signals of corrective versus backfire effects, and develops a predictive model (F1 up to 0.816) that forecasts the outcome of a counter-reply. Key contributions are the dataset, the structured response taxonomy, and an account of which features correlate with corrective impact, which platforms and fact-checkers can use to prioritize social-correction efforts that are likely to correct rather than backfire.

Abstract

Online misinformation poses a global risk with harmful implications for society. Ordinary social media users are known to actively reply to misinformation posts with counter-misinformation messages, a practice that has been shown to be effective in containing the spread of misinformation. This practice is defined as "social correction". Nevertheless, it remains unknown how users respond to social correction in real-world scenarios, and in particular whether it has a corrective or a backfire effect on them. Answering this research question is pivotal for developing and refining strategies that maximize the efficacy of social correction initiatives. To fill this gap, we conduct an in-depth, data-driven study to characterize and predict the user response to social correction through the lens of X (formerly Twitter), where the user response is instantiated as the reply written to a counter-misinformation message. Specifically, we first create a novel dataset of 55,549 triples of misinformation tweets, counter-misinformation replies, and responses to counter-misinformation replies, and then curate a taxonomy of the different kinds of user responses. Next, we conduct a fine-grained statistical analysis of the replies' linguistic and engagement features, as well as the repliers' user attributes, to identify the characteristics that are significant in determining whether a reply will have a corrective or backfire effect. Finally, we build a user response prediction model that identifies whether a social correction will be corrective, neutral, or have a backfire effect, and which achieves a promising F1 score of 0.816. Our work enables stakeholders to monitor and predict user responses effectively, thus guiding the use of social correction to maximize its corrective impact and minimize backfire effects. The code and data are available at https://github.com/claws-lab/response-to-social-correction.

Paper Structure (29 sections, 4 figures, 3 tables)

Introduction
Related Works
Social Correction on Social Media Platforms
User Response to Misinformation Correction
Backfire and Corrective Effects of Misinformation Correction
Dataset
Definition
Misinformation Tweet
Counter-misinformation Reply (i.e., Counter-reply)
Reply to Counter-reply (i.e., Response)
Task Objective
Dataset Curation
Misinformation Tweet Collection and Classification
Counter-reply Collection and Classification
Counter-reply Poster Attribute Collection
...and 14 more sections

Figures (4)

Figure 1: Examples of user responses to social correction. Here, the social correction is the counter-misinformation reply posted by ordinary users (the second row), and the user response is the reply to the counter-misinformation reply (the third row).
Figure 2: Illustration of prompts used in GPT-4 annotation.
Figure 3: Distributions of the total number of responses (black), number of misinformation-disbelieving responses (green), number of misinformation-believing responses (red), and number of neutral responses (gray) per counter-reply, each presented on a log scale.
Figure 4: Distributions of the total number of counter-replies (black), number of corrective counter-replies (green), number of backfire counter-replies (red), and number of neutral replies (gray) based on response per counter-reply, each presented on a log scale.
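
The per-counter-reply response counts described in the Figure 3 caption can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' code: the `Triple` class, its field names, and the label strings `"disbelieving"`, `"believing"`, and `"neutral"` are hypothetical and may not match the released data.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

# Hypothetical label names for a response's stance toward the misinformation.
LABELS = ("disbelieving", "believing", "neutral")


@dataclass(frozen=True)
class Triple:
    """One (misinformation tweet, counter-reply, response) record.

    Field names are assumptions made for this sketch.
    """
    misinformation_tweet_id: str
    counter_reply_id: str
    response_id: str
    response_label: str  # expected to be one of LABELS


def count_responses(triples):
    """Count responses per counter-reply, in total and per label."""
    counts = defaultdict(Counter)
    for t in triples:
        if t.response_label not in LABELS:
            raise ValueError(f"unknown label: {t.response_label!r}")
        counts[t.counter_reply_id][t.response_label] += 1
        counts[t.counter_reply_id]["total"] += 1
    return dict(counts)


# Toy data, invented for illustration only.
example = [
    Triple("m1", "c1", "r1", "disbelieving"),
    Triple("m1", "c1", "r2", "believing"),
    Triple("m1", "c1", "r3", "disbelieving"),
    Triple("m2", "c2", "r4", "neutral"),
]
per_reply = count_responses(example)
```

Here `per_reply["c1"]` holds the total and per-label counts for counter-reply `c1`. Plotting the distribution of these counts over all counter-replies on a log scale would give a figure in the spirit of Figure 3.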
