Supernotes: Driving Consensus in Crowd-Sourced Fact-Checking

Soham De; Michiel A. Bakker; Jay Baxter; Martin Saveski

TL;DR

The Supernotes framework uses an LLM to generate many diverse Supernote candidates from existing proposed notes, and shows that both the LLM-based candidate generation and the consensus-driven scoring play crucial roles in creating notes that effectively build consensus among diverse users.

Abstract

X's Community Notes, a crowd-sourced fact-checking system, allows users to annotate potentially misleading posts. Notes rated as helpful by a diverse set of users are prominently displayed below the original post. While notes are demonstrably effective at reducing misinformation's impact when displayed, there is an opportunity for them to appear on many more posts: for 91% of posts where at least one note is proposed, no note ultimately achieves sufficient support from diverse users to be shown on the platform. This motivates the development of Supernotes: AI-generated notes that synthesize information from several existing community notes and are written to foster consensus among a diverse set of users. Our framework uses an LLM to generate many diverse Supernote candidates from existing proposed notes. These candidates are then evaluated by a novel scoring model, trained on millions of historical Community Notes ratings, which selects the candidates most likely to be rated helpful by a diverse set of users. To test our framework, we ran a human subjects experiment in which we asked participants to compare the Supernotes generated by our framework to the best existing community notes for 100 sample posts. We found that participants rated the Supernotes as significantly more helpful and, when asked to choose between the two, preferred the Supernotes 75.2% of the time. Participants also rated the Supernotes more favorably than the best existing notes on quality, clarity, coverage, context, and argumentativeness. Finally, in a follow-up experiment, we asked participants to compare the Supernotes against LLM-generated summaries and found that participants rated the Supernotes as significantly more helpful, demonstrating that both the LLM-based candidate generation and the consensus-driven scoring play crucial roles in creating notes that effectively build consensus among diverse users.
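
The generate-then-select structure described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `GroupScorer`, `consensus_score`, `select_supernote`, and `toy_scorer` are hypothetical, the per-group predictions are made-up toy values, and taking the minimum predicted helpfulness across rater groups is only a simple stand-in for a consensus criterion. It is not the trained scoring model the abstract describes, and the LLM candidate generation step is replaced by a fixed list.

```python
from typing import Callable, Dict, List

# Hypothetical type: maps a candidate note's text to a predicted helpfulness
# in [0, 1] for each rater group. The abstract describes a model trained on
# historical ratings; this callable is only a placeholder for it.
GroupScorer = Callable[[str], Dict[str, float]]


def consensus_score(group_scores: Dict[str, float]) -> float:
    """Assumed consensus criterion: a note is only as strong as its
    least-convinced rater group (minimum across groups)."""
    return min(group_scores.values())


def select_supernote(candidates: List[str], scorer: GroupScorer) -> str:
    """Return the candidate with the highest consensus score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda note: consensus_score(scorer(note)))


# Toy stand-in for LLM-generated candidates and their predicted ratings.
# These numbers are invented purely to exercise the selection logic.
TOY_PREDICTIONS: Dict[str, Dict[str, float]] = {
    "Candidate A": {"group_1": 0.9, "group_2": 0.2},
    "Candidate B": {"group_1": 0.6, "group_2": 0.7},
    "Candidate C": {"group_1": 0.4, "group_2": 0.5},
}


def toy_scorer(note: str) -> Dict[str, float]:
    """Look up the invented per-group predictions for a candidate."""
    return TOY_PREDICTIONS[note]


best = select_supernote(list(TOY_PREDICTIONS), toy_scorer)
print(best)
```

In this toy setup, Candidate A has the highest rating from one group but is rejected because the other group rates it poorly, which is the intuition behind selecting for support across diverse users rather than for average helpfulness.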
