Scaling Human Judgment in Community Notes with LLMs

Haiwen Li, Soham De, Manon Revel, Andreas Haupt, Brad Miller, Keith Coleman, Jay Baxter, Martin Saveski, Michiel A. Bakker

TL;DR

The paper addresses how to scale Community Notes in the era of large language models while preserving trust by keeping humans as the ultimate arbiters of note usefulness. It proposes a hybrid human-LLM pipeline in which both humans and LLMs can write notes, but only notes that human raters deem helpful, as selected by a bridging algorithm, are shown; rater feedback also drives a Reinforcement Learning from Community Feedback (RLCF) loop. The authors outline a concrete research agenda (RLHF-inspired RLCF, AI co-pilots, rater tools, intelligent note matching, algorithm evolution, and open infrastructure) to realize this system, and they discuss risks such as persuasive but inaccurate notes and rater overload. If successful, this approach could dramatically increase the scale and diversity of contextual notes, serve as a pluralistic information layer for the web, and inform responsible AI alignment.

Abstract

This paper argues for a new paradigm for Community Notes in the LLM era: an open ecosystem where both humans and LLMs can write notes, and the decision of which notes are helpful enough to show remains in the hands of humans. This approach can accelerate the delivery of notes, while maintaining trust and legitimacy through Community Notes' foundational principle: A community of diverse human raters collectively serve as the ultimate evaluator and arbiter of what is helpful. Further, the feedback from this diverse community can be used to improve LLMs' ability to produce accurate, unbiased, broadly helpful notes--what we term Reinforcement Learning from Community Feedback (RLCF). This becomes a two-way street: LLMs serve as an asset to humans--helping deliver context quickly and with minimal effort--while human feedback, in turn, enhances the performance of LLMs. This paper describes how such a system can work, its benefits, key new risks and challenges it introduces, and a research agenda to solve those challenges and realize the potential of this approach.

Paper Structure

This paper contains 10 sections, 2 figures.

Figures (2)

  • Figure 1: An expansion of the Community Notes pipeline from an "all-human" to a hybrid "human-LLM" model. Today (top): Human note writers draft proposed notes in response to a misleading post, and other human contributors rate their helpfulness; the bridging algorithm picks the broadly helpful notes. Future (bottom): LLMs will also participate in the writing stage, producing notes or assisting human writers, while the rating stage remains human-only. Community feedback from human raters flows back to improve LLM note generation (RLCF).
  • Figure 2: A research agenda for LLM-powered Community Notes. (1) Customized LLMs for note generation; (2) AI 'co-pilots' for human writers; (3) AI assistance for human raters; (4) Intelligent note 'matching' adapts existing helpful notes to new, similar contexts; (5) Evolving the core algorithm for AI-generated content; (6) Building a robust and open infrastructure.
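
The hybrid pipeline in Figure 1 can be illustrated with a small Python sketch. It is a deliberately simplified stand-in, not the actual bridging algorithm: it assumes each human rater carries a hypothetical viewpoint label (`rater_group`), treats a note as broadly helpful only when every group that rated it finds it helpful at a high enough rate, and reuses that same number as a toy community-feedback reward of the kind an RLCF loop might consume. All names, thresholds, and sample ratings are illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    note_id: str       # note written by a human or an LLM
    rater_group: str   # hypothetical viewpoint label for a human rater
    helpful: bool      # ratings come from humans only


def group_shares(ratings):
    """Per note, the share of 'helpful' ratings within each rater group."""
    tallies = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for r in ratings:
        tally = tallies[r.note_id][r.rater_group]
        tally[0] += int(r.helpful)
        tally[1] += 1
    return {
        note_id: {g: h / n for g, (h, n) in groups.items()}
        for note_id, groups in tallies.items()
    }


def community_reward(shares, min_groups=2):
    """Toy reward (assumption): the lowest helpful share across groups.

    Notes rated by fewer than `min_groups` groups get 0.0, since agreement
    across differing viewpoints cannot be established for them.
    """
    if len(shares) < min_groups:
        return 0.0
    return min(shares.values())


def broadly_helpful(ratings, threshold=0.6):
    """Ids of notes whose reward clears the (illustrative) threshold."""
    return {
        note_id
        for note_id, shares in group_shares(ratings).items()
        if community_reward(shares) >= threshold
    }


# Illustrative data only: two LLM-written notes and one human-written note.
ratings = [
    Rating("llm-1", "A", True),
    Rating("llm-1", "A", True),
    Rating("llm-1", "B", True),
    Rating("human-1", "A", True),
    Rating("human-1", "A", True),
    Rating("human-1", "B", False),
    Rating("llm-2", "A", True),
]
shown = broadly_helpful(ratings)
```

In this sketch the author of a note, human or LLM, plays no role in selection; only human ratings do. A note that is popular within one group but rejected by another (`human-1`), or rated by a single group only (`llm-2`), is not shown.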