Multi-Agent Fact Checking

Ashwin Verma, Soheil Mohajer, Behrouz Touri

TL;DR

This work addresses online learning of unknown agent unreliabilities in distributed fact-checking, modeling each agent as a memoryless Binary Symmetric Channel with crossover probability $\pi_i$ and the statement stream as IID binary labels. It proposes a low-memory online estimator that uses a likelihood-ratio concept and log-odds weights to update unreliability estimates via a stochastic-approximation-type rule, connected to a mean-field ODE. The authors establish almost-sure convergence to an extended equilibrium set $\bar{\mathcal{E}}$ by constructing a KL-divergence Lyapunov function and extending the domain to handle boundary cases where some agents become fully reliable or unreliable. The results provide a principled, scalable approach to real-time reliability learning in crowdsourced/fact-checking settings, with rigorous convergence guarantees and insights into the role of boundary equilibria. Potential impact includes improved robustness and efficiency of automated fact-checking on platforms with multiple imperfect verifiers.

Abstract

We formulate the problem of fake news detection using distributed fact-checkers (agents) with unknown reliability. The stream of news/statements is modeled as an independent and identically distributed binary source (to represent true and false statements). Upon observing a news item, agent $i$ labels it as true or false, and this label matches the true validity of the statement with probability $1-\pi_i$. In other words, agent $i$ misclassifies each statement with error probability $\pi_i\in (0,1)$, where the parameter $\pi_i$ models the (un)trustworthiness of agent $i$. We present an algorithm to learn the unreliability parameters, resulting in a distributed fact-checking algorithm. Furthermore, we extensively analyze the discrete-time limit of our algorithm.
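
The observation model in the abstract can be made concrete with a short simulation. The sketch below is illustrative only and is not the paper's estimator: it assumes true and false statements are equally likely, and it assumes the error probabilities $\pi_i$ are already known, in which case the maximum a posteriori decision reduces to a vote weighted by the log-odds $\log\frac{1-\pi_i}{\pi_i}$. The function names `simulate_labels`, `log_odds_vote`, and `accuracy` are hypothetical.

```python
import math
import random


def simulate_labels(pi, num_statements, rng):
    """Draw IID binary truths and pass each one through a BSC per agent."""
    truths, labels = [], []
    for _ in range(num_statements):
        x = rng.randint(0, 1)  # assumption: true/false equally likely
        # Agent i reports the flipped label with probability pi[i].
        y = [x ^ int(rng.random() < p) for p in pi]
        truths.append(x)
        labels.append(y)
    return truths, labels


def log_odds_vote(y, pi):
    """Weighted vote with weight log((1 - pi_i) / pi_i) per agent.

    Assumes every pi_i is known and lies strictly between 0 and 1.
    """
    score = 0.0
    for yi, p in zip(y, pi):
        weight = math.log((1.0 - p) / p)
        score += weight if yi == 1 else -weight
    return 1 if score >= 0 else 0


def accuracy(pi, num_statements=5000, seed=0):
    """Fraction of simulated statements that the weighted vote gets right."""
    rng = random.Random(seed)
    truths, labels = simulate_labels(pi, num_statements, rng)
    hits = sum(log_odds_vote(y, pi) == x for x, y in zip(truths, labels))
    return hits / num_statements
```

An agent with $\pi_i > 1/2$ receives a negative weight, so its labels remain informative once inverted; only $\pi_i = 1/2$ carries no information about the statement.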

Paper Structure

This paper contains 21 sections, 15 theorems, 90 equations, 1 figure.

Key Result

Lemma 1

For $n\geq 3$ and $\pi_i \in (0,1)\setminus\{1/2\}$ for $i\in [n]$, the set $\mathcal{S}$ defined in Eq. (eq:S_def) is given by $\mathcal{S} = \{\boldsymbol{\pi}, \boldsymbol{1} - \boldsymbol{\pi}\}$.
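
The two-element form of $\mathcal{S}$ reflects a label-swap symmetry that can be checked numerically. The definition of $\mathcal{S}$ is not reproduced in this summary, so the sketch below rests on an assumed reading: that $\mathcal{S}$ collects the parameter vectors that induce the same joint distribution of agent labels as $\boldsymbol{\pi}$. It also assumes a uniform prior on the statement. The function `label_distribution` is a hypothetical helper, and the check illustrates only the easy direction, namely that $\boldsymbol{1} - \boldsymbol{\pi}$ is indistinguishable from $\boldsymbol{\pi}$.

```python
from itertools import product


def label_distribution(pi, prior_true=0.5):
    """Joint law of (y_1, ..., y_n) for a binary source seen through n BSCs.

    prior_true is the probability that a statement is true; the default
    of 0.5 is an assumption made for this illustration.
    """
    dist = {}
    for y in product((0, 1), repeat=len(pi)):
        p_given_true = 1.0
        p_given_false = 1.0
        for yi, p in zip(y, pi):
            # Agent i reports the truth with probability 1 - p.
            p_given_true *= (1.0 - p) if yi == 1 else p
            p_given_false *= (1.0 - p) if yi == 0 else p
        dist[y] = prior_true * p_given_true + (1.0 - prior_true) * p_given_false
    return dist


pi = [0.1, 0.3, 0.4]
flipped = [1.0 - p for p in pi]
same = all(
    abs(label_distribution(pi)[y] - label_distribution(flipped)[y]) < 1e-12
    for y in product((0, 1), repeat=len(pi))
)
print(same)
```

Under these assumptions, replacing $\boldsymbol{\pi}$ by $\boldsymbol{1} - \boldsymbol{\pi}$ simply exchanges the two mixture components, so no estimator that sees only the labels can tell the two vectors apart.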

Figures (1)

  • Figure 1: Left: Parameter set for $n=2$ agents: the red and green lines represent the sets $\mathcal{X}_{\text{bound}}^{(1)}$ and $\mathcal{X}_{\text{bound}}^{(2)}$, respectively. The shaded region represents $\mathcal{X}$. The box excluding the blue points represents $\bar{\mathcal{X}}$. Right: Truncation set for $n=3$ agents: the red region represents the form of a set $\mathcal{K}_t$. The cube excluding the solid lines shows $\bar{\mathcal{X}}$.

Theorems & Definitions (18)

  • Lemma 1
  • Definition 1
  • Theorem 2: Convergence of Online Estimator
  • Conjecture 3: Characterization of $\mathcal{E}$
  • Lemma 2
  • Lemma 3
  • Definition 2: Kullback-Leibler Divergence [polyanskiy2014lecture]
  • Theorem 4
  • Lemma 4
  • Corollary 1
  • ...and 8 more