Human Supervision as an Information Bottleneck: A Unified Theory of Error Floors in Human-Guided Learning

Alejandro Rodriguez Dominguez

TL;DR

A unified theory is developed showing that whenever the human supervision channel is not sufficient for a latent evaluation target, it acts as an information-reducing channel that induces a strictly positive excess-risk floor for any learner dominated by it.

Abstract

Large language models are trained primarily on human-generated data and feedback, yet they exhibit persistent errors arising from annotation noise, subjective preferences, and the limited expressive bandwidth of natural language. We argue that these limitations reflect structural properties of the supervision channel rather than model scale or optimization. We develop a unified theory showing that whenever the human supervision channel is not sufficient for a latent evaluation target, it acts as an information-reducing channel that induces a strictly positive excess-risk floor for any learner dominated by it. We formalize this Human-Bounded Intelligence limit and show that across six complementary frameworks (operator theory, PAC-Bayes, information theory, causal inference, category theory, and game-theoretic analyses of reinforcement learning from human feedback), non-sufficiency yields strictly positive lower bounds arising from the same structural decomposition into annotation noise, preference distortion, and semantic compression. The theory explains why scaling alone cannot eliminate persistent human-aligned errors and characterizes conditions under which auxiliary non-human signals (e.g., retrieval, program execution, tools) increase effective supervision capacity and collapse the floor by restoring information about the latent target. Experiments on real preference data, synthetic known-target tasks, and externally verifiable benchmarks confirm the predicted structural signatures: human-only supervision exhibits a persistent floor, while sufficiently informative auxiliary channels strictly reduce or eliminate excess error.
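The floor described above can be illustrated with a toy calculation. The sketch below is not the paper's construction: the four-valued target, the uniform prior, the coarsening "human" channel, and the names `bayes_error`, `human`, and `hybrid` are all assumptions chosen for illustration. It computes the smallest 0-1 error any predictor can reach when it sees only the supervision signal, first for a channel that merges pairs of target values (not sufficient for the target) and then for the same channel augmented with an auxiliary signal that restores the missing bit.

```python
def bayes_error(prior, channel):
    """Minimum 0-1 error of any predictor of Y* that observes only the signal S.

    prior[y]      = P(Y* = y)
    channel[y][s] = P(S = s | Y* = y)
    """
    n_targets = len(prior)
    n_signals = len(channel[0])
    # For each signal value, the best guess keeps the largest joint mass
    # P(Y* = y, S = s); everything else under that signal is unavoidable error.
    correct = sum(
        max(prior[y] * channel[y][s] for y in range(n_targets))
        for s in range(n_signals)
    )
    return 1.0 - correct


# Hypothetical latent target with four equally likely values.
prior = [0.25, 0.25, 0.25, 0.25]

# Assumed "human" channel: reports only the coarse bucket y // 2,
# so targets 0 and 1 (and 2 and 3) become indistinguishable.
human = [[1.0 if s == y // 2 else 0.0 for s in range(2)] for y in range(4)]

# Assumed hybrid channel: the coarse bucket plus an auxiliary signal carrying
# the parity y % 2; together they identify y, so the signal index equals y.
hybrid = [[1.0 if s == y else 0.0 for s in range(4)] for y in range(4)]

floor_human = bayes_error(prior, human)
floor_hybrid = bayes_error(prior, hybrid)

print(f"human-only floor:  {floor_human:.2f}")
print(f"with auxiliary:    {floor_hybrid:.2f}")
```

In this toy setting the human-only floor is 0.5 regardless of how much data or model capacity is available, because the signal itself cannot separate the merged targets, while the auxiliary parity signal brings the floor to 0. Real supervision channels are noisy rather than deterministic, so the numbers here only illustrate the mechanism.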

Paper Structure (23 sections, 8 theorems, 58 equations, 3 figures, 6 tables)

This paper contains 23 sections, 8 theorems, 58 equations, 3 figures, 6 tables.

Introduction
Related Work
Formal Framework and HBI Theorem
Ground Truth, Human Channel, and Bias
Learners and Assumptions
Excess Risk and the Human-Bounded Limit
Proof of Theorem 1
Instantiations Across Six Frameworks
Operator-Theoretic Limit
PAC-Bayesian Limit
Information-Theoretic Limit
Causal Non-Identifiability
Category-Theoretic Formulation
RLHF as a Biased Fixed Point
Breaking the Human-Bounded Limit
...and 8 more sections

Key Result

Theorem 1

Under Assumptions ass:human-only--ass:min-sep and the regularity conditions,

Figures (3)

Figure 1: Conceptual information flow under human-only (H), hybrid human+model (H+M), and hybrid with auxiliary channels (H+M+A). Auxiliary channels introduce additional information about $Y^\ast$, increasing effective supervision capacity and reducing or eliminating the structural excess-risk floor.
Figure 2: Real-data scaling behavior. Pairwise accuracy versus training size for human-only supervision ($\alpha=1$, blue) and hybrid supervision ($\alpha=0.5$, orange). Hybrid supervision matches or exceeds human-only performance across scales, while scaling alone does not eliminate the structural supervision gap.
Figure 3: Synthetic distortion trajectory. Objective accuracy as a function of the human-weight parameter $\alpha$ in the known-target synthetic task. Distortion increases monotonically toward human-only supervision ($\alpha = 1$), confirming the predicted structural alignment gap.

Theorems & Definitions (16)

Theorem 1: Human-Bounded Intelligence (HBI)
proof
Theorem 2: Operator-Theoretic HBI
proof
Theorem 3: PAC-Bayes HBI
proof
Theorem 4: Information-Theoretic HBI
proof
Theorem 5: Causal HBI
proof
...and 6 more
