Debate is efficient with your time

Jonah Brown-Cohen; Geoffrey Irving; Simon C. Marshall; Ilan Newman; Georgios Piliouras; Mario Szegedy

TL;DR

This work introduces Debate Query Complexity $DQC(f)$ to quantify the human oversight cost in AI safety debates, defining the minimum number of bits a verifier must inspect from the transcript to decide a Boolean function $f$. It proves that functions depending on all inputs require $\Omega(\log n)$ queries, while circuit-based upper bounds give $DQC(f) \le depth(C_f) + 1$ and $DQC(f) \le \log(size(C_f)) + 3$, culminating in the striking result that $\mathsf{PSPACE/poly} = \{f : DQC(f) \le O(\log n)\}$. The core methods combine cross-examination and adapted Karchmer–Wigderson games to show that many problems can be verified with logarithmic query complexity, making human oversight scalable. The paper also shows randomization offers limited gains in this regime and reveals a deep link between DQC lower bounds and circuit lower bounds, highlighting both practical implications for verification and foundational questions in complexity theory.
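The depth bound $DQC(f) \le depth(C_f) + 1$ can be illustrated with a small sketch. This is not necessarily the paper's protocol: it is the standard game reading of a circuit, assuming fan-in-2 AND/OR gates with negations pushed to the leaves. Debater A (claiming $f(x) = 1$) chooses a child at each OR gate, debater B (claiming $f(x) = 0$) chooses a child at each AND gate, and the verifier reads one transcript bit per level plus a single input bit at the leaf. All names (`Leaf`, `Gate`, `optimal_transcript`, `verify`, `example_circuit`) are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    index: int              # which input bit this literal reads
    negated: bool = False   # True means the literal is NOT x[index]


@dataclass(frozen=True)
class Gate:
    op: str                 # "AND" or "OR"
    left: "Node"
    right: "Node"


Node = Union[Leaf, Gate]


def evaluate(node: Node, x: List[int]) -> bool:
    """Ground-truth evaluation; the verifier never calls this."""
    if isinstance(node, Leaf):
        return bool(x[node.index]) != node.negated
    left, right = evaluate(node.left, x), evaluate(node.right, x)
    return (left and right) if node.op == "AND" else (left or right)


def depth(node: Node) -> int:
    if isinstance(node, Leaf):
        return 0
    return 1 + max(depth(node.left), depth(node.right))


def optimal_transcript(root: Node, x: List[int]) -> List[int]:
    """Both debaters play optimally: A seeks a true child at OR gates,
    B seeks a false child at AND gates. Bit 0 = left, bit 1 = right."""
    bits, node = [], root
    while isinstance(node, Gate):
        wanted = node.op == "OR"  # A wants True, B wants False
        if evaluate(node.left, x) == wanted:
            pick = 0
        elif evaluate(node.right, x) == wanted:
            pick = 1
        else:
            pick = 0  # the owner of this gate has no good move
        bits.append(pick)
        node = node.right if pick else node.left
    return bits


def verify(root: Node, x: List[int], transcript: List[int]) -> Tuple[bool, int]:
    """Follow the transcript down the circuit, counting every bit read."""
    queries, node, pos = 0, root, 0
    while isinstance(node, Gate):
        bit = transcript[pos]   # one transcript query per level
        queries += 1
        pos += 1
        node = node.right if bit else node.left
    queries += 1                # one input query at the leaf
    return bool(x[node.index]) != node.negated, queries


# (x0 AND x1) OR ((NOT x2) AND x3), a depth-2 circuit.
example_circuit = Gate(
    "OR",
    Gate("AND", Leaf(0), Leaf(1)),
    Gate("AND", Leaf(2, negated=True), Leaf(3)),
)

if __name__ == "__main__":
    for x in product([0, 1], repeat=4):
        bits = optimal_transcript(example_circuit, list(x))
        print(x, verify(example_circuit, list(x), bits))
```

Under optimal play the value at the current node never changes as the walk descends, so the literal at the leaf agrees with $f(x)$, and the verifier reads at most $depth(C_f)$ transcript bits plus one input bit.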

Abstract

AI safety via debate uses two competing models to help a human judge verify complex computational tasks. Previous work has established what problems debate can solve in principle, but has not analysed the practical cost of human oversight: how many queries must the judge make to the debate transcript? We introduce Debate Query Complexity (DQC), the minimum number of bits a verifier must inspect to correctly decide a debate. Surprisingly, we find that $\mathsf{PSPACE/poly}$ (the class of problems which debate can efficiently decide) is precisely the class of functions decidable with $O(\log n)$ queries. This characterisation shows that debate is remarkably query-efficient: even for highly complex problems, logarithmic oversight suffices. We also establish that functions depending on all their input bits require $\Omega(\log n)$ queries, and that any function computable by a circuit of size $s$ satisfies $DQC(f) \le \log(s) + 3$. Interestingly, this last result implies that proving DQC lower bounds of $\log(n) + 6$ for languages in $\mathsf{P}$ would yield new circuit lower bounds, connecting debate query complexity to central questions in circuit complexity.
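The final claim of the abstract can be unpacked with a short calculation. As a sketch, assume logarithms are base 2 and $s$ is the size of the smallest circuit computing $f$ (the abstract does not state either explicitly). Chaining a hypothetical lower bound of $\log(n) + 6$ with the stated size upper bound gives:

$$
\log(n) + 6 \;\le\; DQC(f) \;\le\; \log(s) + 3
\quad\Longrightarrow\quad
\log(s) \;\ge\; \log(n) + 3
\quad\Longrightarrow\quad
s \;\ge\; 8n .
$$

Under these assumptions, such a DQC lower bound for a language in $\mathsf{P}$ would force every circuit for it to have size at least $8n$, which is the kind of circuit lower bound the abstract refers to.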

Paper Structure (11 sections, 12 theorems, 12 equations)

Introduction
Definitions
Some basic features of a debate for f
Circuit Upper Bounds
Depth Upper Bound
Size Upper Bound via Cross-Examination
Cross-Examination and PSPACE
The Cross-Examination Lemma
Characterising PSPACE/poly
Lower bound on randomised verifiers
Conclusion

Key Result

Theorem 1

Theorems & Definitions (24)

Theorem
Definition 1: Debate Query Complexity
Definition 2: Valid transcript, valid debate system
Lemma 3
proof
Corollary 4
Theorem 5
proof
Theorem 6
proof
...and 14 more
