Epistemic Throughput: Fundamental Limits of Attention-Constrained Inference

Lei You

TL;DR

The paper addresses how to infer truth from large pools of public records when verification attention is scarce. It introduces Attention-Constrained Inference (ACI), a two-stage framework in which cheap screening of $K$ records is followed by expensive verification of at most $B$ of them, and defines epistemic throughput as the maximum reduction in posterior uncertainty under Bayes log-loss. It derives a JaKoB scaling law, showing that the information gain per window scales as $\mathrm{Gain} = I_{\mathrm{ver}} B \big( p + c \sqrt{J K / B} \big)$, and proves both converse bounds and achievable policies. Tail-leverage analyses show when expanding screening is beneficial: heavy-tailed screening scores enable polynomial gains, whereas light-tailed scores yield only logarithmic amplification. These results inform retrieval-augmented and tool-using systems and motivate a broader paradigm of epistemic communication, in which truthfulness is pursued through efficiently reusable verification artifacts and tail-driven screening strategies.
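
Expanding the product in the expression above separates the two contributions that the abstract describes. This is only an algebraic rearrangement of the stated formula; $c$ remains the unspecified constant from that formula, and the labels under the braces are borrowed from the abstract's wording.

```latex
\begin{align*}
\mathrm{Gain}
  &= I_{\mathrm{ver}}\, B \left( p + c \sqrt{\frac{JK}{B}} \right) \\
  &= \underbrace{I_{\mathrm{ver}}\, p\, B}_{\text{baseline}}
   \;+\; \underbrace{c\, I_{\mathrm{ver}} \sqrt{JKB}}_{\text{information leverage}}
\end{align*}
```

The second line uses $B \sqrt{JK/B} = \sqrt{JKB}$, which is why the same law can be quoted either per verification (the $\sqrt{JK/B}$ form) or per window (the $\sqrt{JKB}$ form).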

Abstract

Recent generative and tool-using AI systems can surface a large volume of candidates at low marginal cost, yet only a small fraction can be checked carefully. This creates a decoder-side bottleneck: downstream decision-makers must form reliable posteriors from many public records under scarce attention. We formalize this regime via Attention-Constrained Inference (ACI), in which a cheap screening stage processes $K$ records and an expensive verification stage can follow up on at most $B$ of them. Under Bayes log-loss, we study the maximum achievable reduction in posterior uncertainty per window, which we call epistemic throughput. Our main result is a "JaKoB" scaling law showing that epistemic throughput has a baseline term that grows linearly with verification and prevalence, and an additional information-leverage term that scales as $\sqrt{JKB}$, where $J$ summarizes screening quality. Thus, expanding cheap screening can nonlinearly amplify scarce verification, even when informative records are rare. We further show that this scaling is tight in a weak-screening limit, and that in the sparse-verification regime ($B \ll K$), substantial leverage requires heavy-tailed score distributions; for light-tailed scores the amplification is only logarithmic.
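
To make the shape of the scaling law concrete, the following is a minimal numerical sketch, not the paper's code. It evaluates a baseline term $pB$ plus a leverage term $c\sqrt{JKB}$ and clips the sum at the oracle ceiling $B$ mentioned in the Figure 1 caption. The function names, the default `c=1.0`, and the default `i_ver=1.0` are illustrative assumptions; the source does not give a value for the constant $c$.

```python
import math


def jakob_gain(K, B, p, J, i_ver=1.0, c=1.0):
    """Illustrative JaKoB-style gain: baseline + leverage, clipped at the ceiling B.

    c is an unspecified constant in the scaling law; c=1.0 is a placeholder
    assumption, as is i_ver=1.0 (information per verification).
    """
    if not (0 < B <= K):
        raise ValueError("expected 0 < B <= K")
    baseline = p * B                      # blind verification contributes B * p
    leverage = c * math.sqrt(J * K * B)   # information-leverage term
    return i_ver * min(B, baseline + leverage)


def per_verification_yield(gain, B, i_ver=1.0):
    """Yield gamma = Gain / (B * I_ver), as defined in the Figure 1 caption."""
    return gain / (B * i_ver)


if __name__ == "__main__":
    # Fixed scarce budget B; growing the cheap screening pool K raises the gain.
    for K in (10**3, 10**4, 10**5):
        g = jakob_gain(K=K, B=100, p=0.01, J=1e-4)
        print(K, round(g, 3), round(per_verification_yield(g, 100), 4))
```

Under these placeholder values, multiplying $K$ by 100 multiplies the leverage term by 10, which is the square-root amplification the abstract refers to.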

Paper Structure (46 sections, 11 theorems, 107 equations, 4 figures)

This paper contains 46 sections, 11 theorems, 107 equations, 4 figures.

Key Result

Lemma 4

Suppose the screening assumption (asmp:screening) holds. Let $S\in\{0,1\}$ be any (possibly randomized) selection rule that depends on $Z$ and on auxiliary randomness independent of $(T,Z)$, and let $\alpha:=\mathbb{P}(S=1)$. Then

Figures (4)

  • Figure 1: JaKoB scaling and information leverage. Blind verification yields the baseline $Bp$, while perfect screening caps at the oracle ceiling $B$. With screening quality $J$ applied to $K$ candidates, screening contributes an additional $\Theta(\sqrt{JKB})$ gain (clipped at $B$); larger $JK$ shifts the curve upward, so the same information gain can be reached with a smaller scarce budget $B$ and a higher per-verification yield $\gamma=\mathrm{Gain}/(B I_{\mathrm{ver}})$.
  • Figure 2: The ACI Inference Pipeline. The system operates as a two-stage information refinery in the haystack regime ($B \ll K$). A massive volume of public records ($K$) is first filtered by a cheap, noisy screening channel ($Z$) to prioritize attention. Based on the screening scores $\eta(Z)$, only the most promising top-$B$ candidates receive expensive, high-fidelity verification ($V$). The oversampling ratio $K/B$ acts as a leverage factor, amplifying the yield of the scarce verification budget to achieve the $\Theta(\sqrt{JKB})$ scaling.
  • Figure 3: Escaping the Gaussian trap: Tail leverage determines screening utility. The figure compares the asymptotic gain from massive screening ($K \gg B$) under light-tailed (Gaussian) versus heavy-tailed (Pareto with varying $\nu$) score distributions. While Gaussian scores yield diminishing returns (scaling logarithmically as $\sqrt{\ln K}$), heavy-tailed scores provide polynomial leverage (scaling as $K^{1/\nu}$). This illustrates the critical dichotomy in the haystack regime: expanding the screening budget $K$ is highly effective only when the score distribution admits exploitable extremes.
  • Figure 4: Finite-length validation of the JaKoB scaling law. We simulate screening scores from a logistic model and apply the top-$B$ verification policy with $K=10^4$, $\delta=0.1$, and $p_0=0.01$. Markers (with $\pm 2$ SE error bars) report the empirical information gain (log-loss reduction, in bits). The solid curve is the benchmark prediction from Theorem \ref{thm:tight-region} (Eq. \ref{eq:Dstar-finite}); the dashed curve is the weak-screening approximation from Theorem \ref{thm:score-achievability}/Corollary \ref{cor:tight-sqrt}. Dash-dot curves show the converse upper bound from Theorem \ref{thm:ver-tradeoff} intersected with the finite-pool constraint, and the dotted curve indicates the finite-pool oracle. Across four screening strengths (AUC $\approx 0.55, 0.70, 0.79, 0.90$), the empirical gains closely track the benchmark and remain below the theoretical and oracle ceilings, while the weak-screening law becomes conservative as screening strengthens. The code to reproduce this plot is available at https://github.com/youlei202/Attention-Constrained-Inference.
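
The top-$B$ policy described in the Figure 2 and Figure 4 captions can be illustrated with a small Monte Carlo sketch. This is not the repository's code and does not reproduce Figure 4: it swaps the paper's logistic score model for an assumed Gaussian mean-shift model (an informative record's score is shifted up by a hypothetical `shift` parameter), and it reports the hit rate among verified records rather than log-loss gain in bits. All function and parameter names are illustrative.

```python
import random


def top_b_hit_rate(K, B, p, shift, rng):
    """Simulate one window and return the fraction of verified records that are informative.

    Assumed toy model (not the paper's): each record is informative with
    probability p, and its screening score is N(0, 1) noise plus `shift`
    when the record is informative.
    """
    records = []
    for _ in range(K):
        informative = rng.random() < p
        score = rng.gauss(0.0, 1.0) + (shift if informative else 0.0)
        records.append((score, informative))
    # Top-B policy: spend the verification budget on the highest-scoring records.
    records.sort(key=lambda r: r[0], reverse=True)
    return sum(inf for _, inf in records[:B]) / B


def mean_hit_rate(K, B, p, shift, trials=20, seed=0):
    """Average the top-B hit rate over independent windows, with a fixed seed."""
    rng = random.Random(seed)
    return sum(top_b_hit_rate(K, B, p, shift, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    p = 0.01
    for shift in (0.0, 0.5, 1.0, 2.0):
        rate = mean_hit_rate(K=10_000, B=100, p=p, shift=shift, trials=5)
        print(f"shift={shift:.1f}  hit rate={rate:.3f}  blind baseline={p:.3f}")
```

With `shift=0.0` the scores carry no information and the hit rate should hover near the prevalence $p$; larger shifts concentrate informative records in the verified set, which is the enrichment effect the pipeline relies on.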

Theorems & Definitions (24)

  • Lemma 4: Selection enrichment bound
  • Definition 5: Budgets $(K,B)$
  • Theorem 6: A tradeoff between verification and attention under log-loss
  • Corollary 7: Verification required for a target gain
  • Theorem 10: Score-based verification achieves a square-root gain
  • Corollary 11: Tight square-root scaling in a weak-screening regime
  • Proposition 12: Gaussian scores yield logarithmic tail leverage
  • Proposition 13: A Pareto right tail yields polynomial tail leverage
  • Remark 14: A fully solved benchmark
  • Theorem 16: Optimal hit rate under score-based selection
  • ...and 14 more
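
The contrast between Propositions 12 and 13 can be previewed with a short calculation of how high the best scores reach as the pool grows. The sketch below computes the $(1 - 1/K)$ quantile, roughly the level of the top score among $K$ draws, for a standard Gaussian and for a Pareto distribution with survival function $x^{-\nu}$ on $x \ge 1$. It is a rough proxy for tail leverage under these assumed distributions, not the quantity the propositions bound, and the function names are illustrative.

```python
from statistics import NormalDist


def gaussian_top_quantile(K):
    """(1 - 1/K) quantile of a standard normal; grows roughly like sqrt(2 ln K)."""
    return NormalDist().inv_cdf(1.0 - 1.0 / K)


def pareto_top_quantile(K, nu):
    """(1 - 1/K) quantile of a Pareto law with P(X > x) = x**(-nu) for x >= 1.

    Solving x**(-nu) = 1/K gives x = K**(1/nu).
    """
    return K ** (1.0 / nu)


if __name__ == "__main__":
    for K in (10**2, 10**4, 10**6):
        print(
            K,
            round(gaussian_top_quantile(K), 3),
            round(pareto_top_quantile(K, nu=2.0), 3),
        )
```

Growing $K$ from $10^4$ to $10^6$ moves the Gaussian level by about one unit, while the Pareto level with $\nu = 2$ grows tenfold, matching the logarithmic versus polynomial dichotomy in the Figure 3 caption.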