Errors are Robustly Tamed in Cumulative Knowledge Processes
Anna Brandenberger, Cassandra Marcussen, Elchanan Mossel, Madhu Sudan
TL;DR
This work develops a general framework for cumulative knowledge processes (CKPs), which model how new units of knowledge attach to existing ones, how errors arise and propagate, and how local checks detect and eliminate them. The framework combines a flexible attachment mechanism a(d), a bounded combination factor M, an adversarial component (q,r), and a local checking procedure with radius k and probability p. Within it, the authors prove robust error-elimination results that hold across all CKPs with regular attachments, provided the adversary is sufficiently limited and checking is sufficiently frequent/deep. They also identify regimes where errors can persist (error survival), develop potential-based analyses (minimum-distance and minimal-false/leaf potentials) to track the dynamics, and establish monotonicity results for the simple tree-CKP with respect to the checking parameters p and k. The findings imply that the quality of large, interdependent knowledge corpora can be preserved under natural, limited-cost checking strategies, even with adversarial insertions and diverse growth rules. These insights offer a theoretical foundation for designing resilient scholarly and software knowledge ecosystems and suggest directions for future work, including more general attachment schemes and phase-transition characterizations.
Abstract
We study processes of societal knowledge accumulation, where the validity of a new unit of knowledge depends both on the correctness of its derivation and on the validity of the units it depends on. A fundamental question in this setting is: If a constant fraction of the new derivations is wrong, can investing a constant fraction, bounded away from one, of effort ensure that a constant fraction of knowledge in society is valid? Ben-Eliezer, Mikulincer, Mossel, and Sudan (ITCS 2023) introduced a concrete probabilistic model to analyze such questions and showed an affirmative answer to this question. Their study, however, focuses on the simple case where each new unit depends on just one existing unit, and units attach according to a preferential attachment rule. In this work, we consider much more general families of cumulative knowledge processes, where new units may attach according to varied attachment mechanisms and depend on multiple existing units. We also allow a (random) fraction of insertions of adversarial nodes. We give a robust affirmative answer to the above question by showing that for all of these models, as long as many of the units follow simple heuristics for checking a bounded number of units they depend on, all errors will be eventually eliminated. Our results indicate that preserving the quality of large interdependent collections of units of knowledge is feasible, as long as careful but not too costly checks are performed when new units are derived/deposited.
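To make the roles of the checking probability p and the checking radius k concrete, the following is a minimal simulation sketch of a toy tree-CKP, not the authors' model. It rests on several assumptions that the text above does not specify: each new unit depends on exactly one existing unit (so M = 1), there is no adversary (the (q,r) component is omitted), the attachment weight is taken to be one plus the number of surviving children, each new derivation is flawed independently with a hypothetical probability `eps`, and a successful check discards the first flawed ancestor found together with its entire subtree. The function name `simulate_tree_ckp` and all variable names are illustrative only.

```python
import random


def simulate_tree_ckp(steps, eps, p, k, seed=0):
    """Toy tree-CKP; returns (surviving_units, surviving_units_that_are_false).

    A unit counts as false if it, or any of its ancestors, has a flawed derivation.
    All modelling choices here are assumptions made for illustration.
    """
    rng = random.Random(seed)
    parent = {0: None}     # unit 0 is the root and has no parent
    flawed = {0: False}    # the root is assumed to be correct
    children = {0: []}
    alive = {0}            # units that have not been discarded

    def discard_subtree(v):
        # Remove v and every surviving descendant of v.
        stack = [v]
        while stack:
            w = stack.pop()
            if w in alive:
                alive.remove(w)
                stack.extend(children[w])

    for u in range(1, steps + 1):
        # Assumed attachment rule: weight = 1 + number of surviving children.
        nodes = sorted(alive)
        weights = [1 + sum(c in alive for c in children[v]) for v in nodes]
        v = rng.choices(nodes, weights=weights)[0]

        parent[u] = v
        flawed[u] = rng.random() < eps   # flawed derivation with probability eps
        children[u] = []
        children[v].append(u)
        alive.add(u)

        # With probability p, inspect up to k ancestors, starting at the parent.
        if rng.random() < p:
            w, depth = v, 0
            while w is not None and depth < k:
                if flawed[w]:
                    discard_subtree(w)   # also discards the new unit u
                    break
                w, depth = parent[w], depth + 1

    # Parents always have smaller ids than their children, so one pass suffices.
    tainted = {}
    for u in sorted(alive):
        par = parent[u]
        tainted[u] = flawed[u] or (par is not None and tainted[par])
    return len(alive), sum(tainted.values())


if __name__ == "__main__":
    for p_check in (0.0, 0.3, 0.9):
        n_alive, n_false = simulate_tree_ckp(steps=2000, eps=0.1, p=p_check, k=3)
        print(f"p={p_check}: surviving={n_alive}, false={n_false}")
```

Running the script for a few values of p gives a rough feel for how more frequent checking changes the number of surviving false units in this toy setting. It says nothing about the general attachment rules, multi-parent dependencies, or adversarial insertions that the paper analyzes.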
