Debate is efficient with your time

Jonah Brown-Cohen, Geoffrey Irving, Simon C. Marshall, Ilan Newman, Georgios Piliouras, Mario Szegedy

TL;DR

This work introduces Debate Query Complexity $DQC(f)$ to quantify the human oversight cost in AI safety debates, defining the minimum number of bits a verifier must inspect from the transcript to decide a Boolean function $f$. It proves that functions depending on all inputs require $\Omega(\log n)$ queries, while circuit-based upper bounds give $DQC(f) \le depth(C_f) + 1$ and $DQC(f) \le \log(size(C_f)) + 3$, culminating in the striking result that $\mathsf{PSPACE/poly} = \{f : DQC(f) \le O(\log n)\}$. The core methods combine cross-examination and adapted Karchmer–Wigderson games to show that many problems can be verified with logarithmic query complexity, making human oversight scalable. The paper also shows randomization offers limited gains in this regime and reveals a deep link between DQC lower bounds and circuit lower bounds, highlighting both practical implications for verification and foundational questions in complexity theory.
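
As a rough illustration of the two circuit-based upper bounds quoted above, the sketch below evaluates both for a given circuit and returns the smaller one. The function name `dqc_upper_bound` is hypothetical, and the code assumes the logarithm is base 2 and rounded up to a whole number of queries; the summary does not state either convention.

```python
import math


def dqc_upper_bound(depth: int, size: int) -> int:
    """Return the smaller of the two circuit-based upper bounds on DQC(f).

    Assumptions (not stated in the summary): log is base 2 and is rounded
    up, since a verifier reads a whole number of bits.
    """
    if depth < 0 or size < 1:
        raise ValueError("depth must be >= 0 and size must be >= 1")
    depth_bound = depth + 1                        # DQC(f) <= depth(C_f) + 1
    size_bound = math.ceil(math.log2(size)) + 3    # DQC(f) <= log(size(C_f)) + 3
    return min(depth_bound, size_bound)


# A shallow circuit: the depth bound is the tighter one.
print(dqc_upper_bound(depth=10, size=1024))
# A deep circuit of the same size: the size bound takes over.
print(dqc_upper_bound(depth=100, size=1024))
```

Under these assumptions, the size-based bound grows only logarithmically, so a deep circuit of polynomial size still gets a bound logarithmic in its size.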

Abstract

AI safety via debate uses two competing models to help a human judge verify complex computational tasks. Previous work has established what problems debate can solve in principle, but has not analysed the practical cost of human oversight: how many queries must the judge make to the debate transcript? We introduce Debate Query Complexity (DQC), the minimum number of bits a verifier must inspect to correctly decide a debate. Surprisingly, we find that $\mathsf{PSPACE/poly}$ (the class of problems which debate can efficiently decide) is precisely the class of functions decidable with $O(\log n)$ queries. This characterisation shows that debate is remarkably query-efficient: even for highly complex problems, logarithmic oversight suffices. We also establish that functions depending on all their input bits require $\Omega(\log n)$ queries, and that any function computable by a circuit of size $s$ satisfies $DQC(f) \le \log(s) + 3$. Interestingly, this last result implies that proving DQC lower bounds of $\log(n) + 6$ for languages in $\mathsf{P}$ would yield new circuit lower bounds, connecting debate query complexity to central questions in circuit complexity.
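
The link to circuit lower bounds can be spelled out by chaining the two bounds stated in the abstract. The short derivation below is a reading of those bounds rather than a statement taken from the paper; it assumes the logarithm is base 2 and ignores rounding. Suppose $f$ has a smallest circuit of size $s$ and a lower bound $DQC(f) \ge \log(n) + 6$ has been proved. Then

$$\log(n) + 6 \;\le\; DQC(f) \;\le\; \log(s) + 3 \quad\Longrightarrow\quad \log(s) \;\ge\; \log(n) + 3 \quad\Longrightarrow\quad s \;\ge\; 8n.$$

Under these assumptions, a DQC lower bound only a constant above $\log(n)$ would already force a linear lower bound on circuit size with a concrete constant factor.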

Paper Structure (11 sections, 12 theorems, 12 equations)


Key Result

Theorem 1

Theorems & Definitions (24)

  • Theorem
  • Definition 1: Debate Query Complexity
  • Definition 2: Valid transcript, valid debate system
  • Lemma 3
  • proof
  • Corollary 4
  • Theorem 5
  • proof
  • Theorem 6
  • proof
  • ...and 14 more