Optimal sets of questions for Twenty Questions
Yuval Filmus, Idan Mehalel
TL;DR
This work determines, up to sub-exponential factors, the minimal size q(n) of a set of yes/no questions that supports an optimal strategy for every distribution in the distributional Twenty Questions problem. The growth rate is governed by a function G(β) through ρ_min(n) = 2^{G(β)n ± o(n)} for n = β·2^k. The authors give a variational formula for G(β), prove that it is finite, and show that 2^{-G(β)} ≤ 1.25 with equality only at β = 1.25; they also improve the known lower bound on q(n) to 1.236^{n−o(n)}. For d-ary questions, they obtain q^{(d)}(n) ≤ (1 + (d−1)/d^{d/(d−1)})^{n+o(n)} and show that this bound is tight for infinitely many n, with a uniform regime for d = o(n/log^2 n). The proofs reduce q(n) to a density parameter ρ_min(n) over dyadic (and d-ary) distributions, then analyze exact and approximate forms via a variational program, fibers, and hitting sets.
Abstract
In the distributional Twenty Questions game, Bob chooses a number $x$ from $1$ to $n$ according to a distribution $μ$, and Alice (who knows $μ$) attempts to identify $x$ using Yes/No questions, which Bob answers truthfully. Her goal is to minimize the expected number of questions. The optimal strategy for the Twenty Questions game corresponds to a Huffman code for $μ$, yet this strategy could potentially use all $2^n$ possible questions. Dagan et al. constructed a set of $1.25^{n+o(n)}$ questions which suffice to construct an optimal strategy for all $μ$, and showed that this number is optimal (up to sub-exponential factors) for infinitely many $n$. We determine the optimal size of such a set of questions for all $n$ (up to sub-exponential factors), answering an open question of Dagan et al. In addition, we generalize the results of Dagan et al. to the $d$-ary setting, obtaining similar results with $1.25$ replaced by $1 + (d-1)/d^{d/(d-1)}$.
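To make the Huffman baseline and the $d$-ary constant concrete, the following is a minimal Python sketch, not code from the paper. It computes the expected number of Yes/No questions of a Huffman strategy for a given $μ$ (the sum of the weights of all merged nodes) and evaluates the base $1 + (d-1)/d^{d/(d-1)}$. The function names `huffman_expected_questions` and `dary_base` are hypothetical, chosen here for illustration. The sketch does not construct the restricted question sets studied in the paper.

```python
import heapq
from fractions import Fraction


def huffman_expected_questions(mu):
    """Expected number of yes/no questions of a Huffman strategy for mu.

    mu is a sequence of probabilities summing to 1. Every merge of the two
    lightest subtrees adds one question for all elements below it, so the
    expected cost equals the sum of the merged weights.
    """
    heap = list(mu)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # lightest subtree
        b = heapq.heappop(heap)  # second lightest subtree
        cost += a + b
        heapq.heappush(heap, a + b)
    return cost


def dary_base(d):
    """Base of the exponential bound in the d-ary setting (illustrative helper)."""
    return 1 + (d - 1) / d ** (d / (d - 1))


# A dyadic distribution: the Huffman cost matches the entropy.
dyadic = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
dyadic_cost = huffman_expected_questions(dyadic)

# The uniform distribution on three elements.
uniform3_cost = huffman_expected_questions([Fraction(1, 3)] * 3)

# For d = 2 the base reduces to the binary constant 1.25.
binary_base = dary_base(2)
```

Using `Fraction` keeps the arithmetic exact, so the dyadic example can be compared against its entropy without floating-point error.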
