
The Novelty Bottleneck: A Framework for Understanding Human Effort Scaling in AI-Assisted Work

Jacky Liang

Abstract

We propose a stylized model of human-AI collaboration that isolates a mechanism we call the novelty bottleneck: the fraction of a task requiring human judgment creates an irreducible serial component analogous to Amdahl's Law in parallel computing. The model assumes that tasks decompose into atomic decisions, a fraction $\nu$ of which are "novel" (not covered by the agent's prior), and that specification, verification, and error correction each scale with task size. From these assumptions, we derive several non-obvious consequences: (1) there is no smooth sublinear regime for human effort: it transitions sharply from $O(E)$ to $O(1)$ with no intermediate scaling class; (2) better agents improve the coefficient on human effort but not the exponent; (3) for organizations of $n$ humans with AI agents, optimal team size decreases with agent capability; (4) wall-clock time achieves $O(\sqrt{E})$ through team parallelism but total human effort remains $O(E)$; and (5) the resulting AI safety profile is asymmetric -- AI is bottlenecked on frontier research but unbottlenecked on exploiting existing knowledge. We show these predictions are consistent with empirical observations from AI coding benchmarks, scientific productivity data, and practitioner reports. Our contribution is not a proof that human effort must scale linearly, but a framework that identifies the novelty fraction as the key parameter governing AI-assisted productivity, and derives consequences that clarify -- rather than refute -- prevalent narratives about intelligence explosions and the "country of geniuses in a data center."

Paper Structure

This paper contains 37 sections, 1 theorem, 10 equations, 5 figures, 5 tables.

Key Result

Proposition 1

Under assumptions A1--A5, $H$ transitions sharply from $O(E)$ to $O(1)$ with no intermediate sublinear scaling class. Specifically, for $H$ to be $o(E)$, all of the following must hold simultaneously: (i) $\nu = 0$, (ii) $c_v = 0$, (iii) $c_c = 0$, and (iv) $c_d = 0$.
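To make the mechanism behind Proposition 1 concrete, here is a minimal sketch of the effort model it describes. The functional form and the coefficient values are assumptions for illustration, not the paper's calibration: total human effort is taken as $H(E) = (c_s\nu + c_v + c_c + c_d)E + h_0$, where $c_s$, $c_v$, $c_c$, $c_d$ are per-decision costs of specification, verification, correction, and decomposition, and $h_0$ is a fixed overhead.

```python
def human_effort(E, nu, c_s=1.0, c_v=0.1, c_c=0.05, c_d=0.02, h0=5.0):
    """Total human effort for a task of E atomic decisions.

    Assumed linear form: every per-decision cost contributes a term
    proportional to E, plus a constant overhead h0.
    """
    return (c_s * nu + c_v + c_c + c_d) * E + h0

# Even 1% novelty pins H/E to a positive constant as E grows:
for E in (10**2, 10**4, 10**6):
    print(E, human_effort(E, nu=0.01) / E)  # -> 0.18 + h0/E

# The O(1) regime requires ALL coefficients to vanish simultaneously:
print(human_effort(10**6, nu=0.0, c_v=0.0, c_c=0.0, c_d=0.0))  # -> 5.0
```

The point of the sketch is the dichotomy: zeroing any proper subset of the coefficients only shrinks the slope, so $H/E$ still converges to a positive constant; there is no parameter setting that yields, say, $O(\sqrt{E})$.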

Figures (5)

  • Figure 1: Simulation results across configurations. Top left: $H$ vs. $E$ is linear for all configurations. Top center: $H/E$ ratio converges to a constant. Top right: Agent trajectory divergence grows as $O(\sqrt{E})$, confirming the random walk model. Bottom left: Higher mutual information reduces the linear coefficient but not the scaling class. Bottom center: Even 1% novelty makes $H/E$ converge to a positive constant. Bottom right: Stacked decomposition of $H$ for the high-novelty case.
  • Figure 2: Left: Log-log plot confirming linear scaling ($\alpha \approx 1$) across all configurations. Reference lines for $O(E)$, $O(\sqrt{E})$, and $O(\log E)$ shown. Right: Fitted scaling exponents; all configurations cluster tightly around $\alpha = 1.0$.
  • Figure 3: March of nines analysis. Left: End-to-end reliability decays exponentially with task length. Center: Required per-step nines grow with $E$ (log scale). Right: Human checkpoints required scale linearly with $E$.
  • Figure 4: Verifiability frontier. Left: $H/E$ ratio across the novelty-verifiability plane. Center: Verifiability reduces the slope but does not eliminate linear scaling. Right: Concrete task types placed in the novelty-verifiability space.
  • Figure 5: Organizational scaling results. Top left: Wall-clock time vs. team size shows U-shaped curves with optimal $n^*$ (dots) that shifts left as agents improve. Top center: Total human effort grows superlinearly past $n^*$. Top right: Optimal team size grows as $O(\sqrt{E})$ but shrinks with agent capability. Bottom left: Better agents systematically reduce optimal team size. Bottom center: Minimum achievable wall-clock time scales as $O(\sqrt{E})$ for all configurations. Bottom right: Work efficiency (fraction of effort spent on useful work vs. coordination) decays rapidly with team size, faster for more capable agents.
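The organizational results in Figure 5 can be reproduced qualitatively with a standard work-versus-coordination model. The functional forms below are assumptions chosen to match the figure's description, not the paper's exact model: each of $n$ humans carries $kE/n$ of the serial human work (with $k$ the linear coefficient from Proposition 1, which better agents reduce), plus a coordination cost growing linearly in team size.

```python
import math

def wall_clock(E, n, k=0.18, c=0.5):
    """Assumed wall-clock time: parallelized human work plus coordination.

    k*E is total serial human effort (linear in E); c scales the
    per-person coordination overhead.
    """
    return k * E / n + c * (n - 1)

def optimal_team(E, k=0.18, c=0.5):
    """Team size minimizing wall_clock: dT/dn = -kE/n^2 + c = 0."""
    return math.sqrt(k * E / c)  # n* = O(sqrt(E)), shrinking as k falls

E = 10**4
n_star = optimal_team(E)          # 60 people for these toy parameters
t_min = wall_clock(E, n_star)     # ~ 2*sqrt(k*c*E) - c, i.e. O(sqrt(E))
```

This reproduces the figure's three claims under the assumed forms: the U-shaped wall-clock curve, $n^* = \sqrt{kE/c} = O(\sqrt{E})$ that shrinks as agent capability lowers $k$, and a minimum wall-clock time of $O(\sqrt{E})$ even though total human effort $n \cdot T(n)$ stays $O(E)$ at the optimum.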

Theorems & Definitions (5)

  • Definition 1: Human Effort
  • Definition 2: Mutual Information / Shared Prior
  • Definition 3: Novelty
  • Proposition 1: No Sublinear Regime
  • Remark 1