Improved Bounds for High-Dimensional Equivalence and Product Testing using Subcube Queries
Tomer Adar, Eldar Fischer, Amit Levi
TL;DR
This work advances high-dimensional distribution testing in the subcube conditional model. It introduces an equivalence tester for two distributions that is quasi-linear in $n$, using $\tilde{O}(n/\varepsilon^2)$ queries in the binary setting, and extends it to general finite alphabets at a cost logarithmic in the alphabet size. It also tightens the bound under the interval querying model for distributions over $\{1,\ldots,N\}$ to $\tilde{O}((\log N)/\varepsilon^2)$, and gives a product tester with $\tilde{O}(n/\varepsilon^2)$ queries together with an $Ω(\sqrt{n}/\varepsilon^2)$ lower bound. The testers use only restricted query families, such as prefix and marginal-prefix queries, rather than general subcube queries. The results generalize to mixed alphabets and establish reductions that unify equivalence and product testing under conditional subcube access. Together, these contributions reduce the query complexity of core distribution testing tasks, notably in database-like and high-dimensional settings with restricted query capabilities.
Abstract
We study property testing in the subcube conditional model introduced by Bhattacharyya and Chakraborty (2017). We obtain the first equivalence test for $n$-dimensional distributions that is quasi-linear in $n$, improving the previously known $\tilde{O}(n^2/\varepsilon^2)$ query complexity bound to $\tilde{O}(n/\varepsilon^2)$. We extend this result to general finite alphabets with logarithmic cost in the alphabet size. By exploiting the specific structure of the queries that we use (which are more restrictive than general subcube queries), we obtain a cubic improvement over the best known test for distributions over $\{1,\ldots,N\}$ under the interval querying model of Canonne, Ron and Servedio (2015), attaining a query complexity of $\tilde{O}((\log N)/\varepsilon^2)$, which for fixed $\varepsilon$ almost matches the known lower bound of $Ω((\log N)/\log\log N)$. We also derive a product test for $n$-dimensional distributions with $\tilde{O}(n / \varepsilon^2)$ queries, and provide an $Ω(\sqrt{n} / \varepsilon^2)$ lower bound for this property.
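To make the query model concrete, the following is a minimal sketch of a subcube conditional query oracle over $\{0,1\}^n$; it is an illustration only and not the testers from this work. It assumes the distribution is given explicitly as a dictionary of probabilities. The names `subcube_conditional_sample`, `dist`, and `restriction` are hypothetical, and raising an error on a zero-mass subcube is a convention chosen here, not one stated in the abstract.

```python
import random
from itertools import product


def subcube_conditional_sample(dist, restriction, rng=random):
    """Draw one sample from `dist` conditioned on a subcube.

    dist: dict mapping bit tuples in {0,1}^n to probabilities
          (an explicit representation, assumed for illustration).
    restriction: dict {coordinate index: fixed bit} defining the subcube.
    rng: the `random` module or a `random.Random` instance.
    """
    # Keep only points with positive mass that agree with every fixed coordinate.
    support = [
        (x, p)
        for x, p in dist.items()
        if p > 0 and all(x[i] == b for i, b in restriction.items())
    ]
    if not support:
        # Assumption of this sketch: zero-mass subcubes are reported as errors.
        raise ValueError("subcube has zero probability mass")
    points, weights = zip(*support)
    # random.choices normalizes the weights, which gives the conditional law.
    return rng.choices(points, weights=weights, k=1)[0]


# Usage: the uniform distribution on {0,1}^3, conditioned on x_0 = 1 and x_2 = 0.
n = 3
uniform = {x: 1 / 2**n for x in product((0, 1), repeat=n)}
rng = random.Random(0)
samples = [subcube_conditional_sample(uniform, {0: 1, 2: 0}, rng) for _ in range(20)]
```

In this sketch, a prefix query corresponds to a `restriction` whose keys are exactly the first $k$ coordinates, which is a special case of a general subcube query.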
