Debate is efficient with your time
Jonah Brown-Cohen, Geoffrey Irving, Simon C. Marshall, Ilan Newman, Georgios Piliouras, Mario Szegedy
TL;DR
This work introduces Debate Query Complexity $DQC(f)$ to quantify the human oversight cost in AI safety debates, defining the minimum number of bits a verifier must inspect from the transcript to decide a Boolean function $f$. It proves that functions depending on all inputs require $\Omega(\log n)$ queries, while circuit-based upper bounds give $DQC(f) \le depth(C_f) + 1$ and $DQC(f) \le \log(size(C_f)) + 3$, culminating in the striking result that $\mathsf{PSPACE/poly} = \{f : DQC(f) \le O(\log n)\}$. The core methods combine cross-examination and adapted Karchmer–Wigderson games to show that many problems can be verified with logarithmic query complexity, making human oversight scalable. The paper also shows randomization offers limited gains in this regime and reveals a deep link between DQC lower bounds and circuit lower bounds, highlighting both practical implications for verification and foundational questions in complexity theory.
Abstract
AI safety via debate uses two competing models to help a human judge verify complex computational tasks. Previous work has established what problems debate can solve in principle, but has not analysed the practical cost of human oversight: how many queries must the judge make to the debate transcript? We introduce Debate Query Complexity (DQC), the minimum number of bits a verifier must inspect to correctly decide a debate. Surprisingly, we find that $\mathsf{PSPACE/poly}$ (the class of problems which debate can efficiently decide) is precisely the class of functions decidable with $O(\log n)$ queries. This characterisation shows that debate is remarkably query-efficient: even for highly complex problems, logarithmic oversight suffices. We also establish that functions depending on all their input bits require $\Omega(\log n)$ queries, and that any function computable by a circuit of size $s$ satisfies $DQC(f) \le \log(s) + 3$. Interestingly, this last result implies that proving DQC lower bounds of $\log(n) + 6$ for languages in P would yield new circuit lower bounds, connecting debate query complexity to central questions in circuit complexity.
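
To see why the threshold $\log(n) + 6$ connects to circuit size, one can combine it with the stated upper bound. The short derivation below is a sketch, assuming logarithms are base 2 and writing $s$ for the size of the smallest circuit computing $f$ (both are assumptions not fixed by the abstract):

$$
\log(n) + 6 \le DQC(f) \le \log(s) + 3 \implies \log(s) \ge \log(n) + 3 \implies s \ge 8n.
$$

Under these assumptions, such a DQC lower bound for a language in P would force every circuit computing it to have size at least $8n$.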
