Is this correct? Let's check!

Omri Ben-Eliezer; Dan Mikulincer; Elchanan Mossel; Madhu Sudan

TL;DR

This paper introduces the Cumulative Knowledge Process (CKP), a probabilistic tree-growth model that captures how knowledge accumulates and propagates errors under occasional checks. It analyzes two regimes—a simple CKP (no new errors) and a general CKP (with errors)—using exponential potential martingales and related Lyapunov-type functionals to determine when errors disappear versus persist, and to quantify reliability properties. The main findings reveal a phase-transition-like behavior: sufficiently frequent and deep checks (high $p$, large $k$) eliminate errors, while shallow or infrequent checks allow error propagation, with specific thresholds depending on $p$, $k$, and $\varepsilon$. The results show that when errors are eliminated, the process becomes highly reliable and most nodes reflect truth, and that surviving error structures tend to remain sublinearly large, offering theoretical insight into robustness of scientific and software knowledge with practical implications for how aggressively to allocate checking resources. The CKP framework thus provides a rigorous lens on error mitigation in cumulative knowledge systems and highlights the critical role of check depth in preventing cascading falsehoods.

Abstract

Societal accumulation of knowledge is a complex process. The correctness of new units of knowledge depends not only on the correctness of new reasoning, but also on the correctness of old units that the new one builds on. The errors in such accumulation processes are often remedied by error correction and detection heuristics. Motivating examples include the scientific process based on scientific publications, and software development based on libraries of code. Natural processes that aim to keep errors under control, such as peer review in scientific publications, and testing and debugging in software development, would typically check existing pieces of knowledge, both for the reasoning that generated them and the previous facts they rely on. In this work, we present a simple process that models such accumulation of knowledge and study the persistence (or lack thereof) of errors. We consider a simple probabilistic model for the generation of new units of knowledge based on the preferential attachment growth model, which additionally allows for errors. Furthermore, the process includes checks aimed at catching these errors. We investigate when effects of errors persist forever in the system (with positive probability) and when they get rooted out completely by the checking process. The two basic parameters associated with the checking process are the probability of conducting a check and the depth of the check. We show that errors are rooted out if checks are sufficiently frequent and sufficiently deep. In contrast, shallow or infrequent checks are insufficient to root out errors.
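To make the roles of the check probability $p$, the check depth $k$, and the error rate $\varepsilon$ concrete, the process described above can be sketched as a small simulation. This is only an illustrative sketch, not the paper's formal definition: the function name `simulate_ckp`, the labels `CT`/`CF`/`PF` (conditionally true, conditionally false, proven false), the attachment weight (1 + number of children, restricted to nodes not proven false), the convention that a check of depth $k$ inspects the new node plus up to $k-1$ ancestors, and the rule that a detected error marks the inspected path as proven false are all assumptions made here for illustration.

```python
import random


def simulate_ckp(p, k, eps=0.0, steps=2000, seed=0):
    """Toy simulation of a cumulative-knowledge tree with checks.

    Assumptions (illustrative, not taken from the paper's formal model):
      * the root is false (CF); with eps=0 this mimics the "simple" regime
      * a new node attaches to a non-PF node chosen with probability
        proportional to 1 + (number of children)
      * a new node is CF with probability eps, otherwise CT
      * with probability p, a check walks up to k nodes starting at the
        new node; on meeting a CF or PF node it marks the walked path PF
    """
    rng = random.Random(seed)
    parent = [None]   # parent[v] is the index of v's parent (None for root)
    label = ["CF"]    # the root carries the initial error
    weight = [1]      # 1 + number of children, used for attachment

    for _ in range(steps):
        alive = [v for v in range(len(label)) if label[v] != "PF"]
        if not alive:
            break  # every node is proven false: nothing left to build on

        # Preferential attachment among nodes not yet proven false.
        v = rng.choices(alive, weights=[weight[u] for u in alive])[0]
        new = len(label)
        parent.append(v)
        label.append("CF" if rng.random() < eps else "CT")
        weight.append(1)
        weight[v] += 1

        # Occasional check of depth k along the ancestor path.
        if rng.random() < p:
            path = []
            u = new
            for _ in range(k):
                if u is None:
                    break
                path.append(u)
                if label[u] in ("CF", "PF"):
                    for w in path:
                        label[w] = "PF"
                    break
                u = parent[u]

    alive_count = sum(1 for s in label if s != "PF")
    return {"eliminated": alive_count == 0,
            "size": len(label),
            "alive": alive_count}


if __name__ == "__main__":
    # Compare a shallow-check and a deep-check setting (hypothetical values).
    for p, k in [(0.3, 2), (0.9, 4)]:
        print(p, k, simulate_ckp(p, k, steps=2000, seed=1))
```

A single run only shows one sample path; the paper's statements concern probabilities over the whole process, so any empirical comparison would need many seeds and should be read as a rough intuition aid rather than a check of the theorems.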

Paper Structure (31 sections, 26 theorems, 89 equations)

Introduction
Related Work
Noisy Computation and the PMC Model.
Local Error Correction and the "Positive Rate" Conjecture.
The Reproducibility and Replication Crisis.
Knowledge Aggregation vs. Information Spreading.
Models
Semantics:
Parameters:
State evolution:
Initial state:
The simple CKP:
Phenomena we care about:
PT components:
Proof Ideas and Techniques
...and 16 more sections

Key Result

Theorem 2.1

For all $p \geq \frac{6}{7}$ and $k \geq 4$, the error effect in the $(p,k)$-simple CKP is completely eliminated.

Theorems & Definitions (52)

Definition 1.1: Properties of CKPs
Definition 1.2: PT Component
Theorem 2.1: Error effect elimination in the simple model
Theorem 2.2: Error effect survival in the simple model
Theorem 2.3: Error effect survival when $k = 2$
Theorem 2.4
Theorem 2.5: Error effect elimination in the general model
Theorem 2.6: Error effect survival in the general model
Theorem 2.7
Theorem 3.1: Error effect elimination in the simple model
...and 42 more
