Testing vs Estimation for Index-Invariant Properties in the Huge Object Model
Sourav Chakraborty, Eldar Fischer, Arijit Ghosh, Amit Levi, Gopinath Mishra, Sayantan Sen
TL;DR
This work extends property testing in the Huge Object model to index-invariant properties by importing Szemerédi-style regularity ideas into a detailing framework. It shows that for any index-invariant property that admits a constant-query $\varepsilon$-test, one can construct a distance-estimation algorithm with constant query complexity, bridging testing and tolerant estimation in this setting. The approach introduces detailings, weight/type distributions, and notions of robustness and weak robustness, and builds a simulation-based pipeline (Simulate, Accept-Probability) together with procedures to locate robust detailings and estimate their parameters. The results highlight what index-invariance makes possible: constant-query testability implies constant-query estimability, with controlled (non-tower) parameter dependencies, offering a principled regularity-like framework for Huge Object property testing.
Abstract
The Huge Object model of property testing [Goldreich and Ron, TheoretiCS 23] concerns properties of distributions supported on $\{0,1\}^n$, where $n$ is so large that even reading a single sampled string is unrealistic. Instead, query access is provided to the samples, and the efficiency of the algorithm is measured by the total number of queries that were made to them. Index-invariant properties under this model were defined in [Chakraborty et al., COLT 23], as a compromise between enduring the full intricacies of string testing when considering unconstrained properties, and giving up completely on the string structure when considering label-invariant properties. Index-invariant properties are those that are invariant under a consistent reordering of the bits of the involved strings. Here we provide an adaptation of Szemerédi's regularity method for this setting, and in particular show that if an index-invariant property admits an $\varepsilon$-test with a number of queries depending only on the proximity parameter $\varepsilon$, then it also admits a distance estimation algorithm whose number of queries depends only on the approximation parameter.
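To make the access model and the notion of index-invariance concrete, the following is a minimal illustrative sketch and not the paper's construction. The names `HugeObjectOracle`, `permute_support`, and `pairwise_distance_profile` are hypothetical, and the distribution is assumed to be given explicitly as a small support with weights, which a real tester would never see. The oracle charges one unit per queried bit, and the statistic shown (weighted pairwise Hamming distances) is one simple example of a quantity that is unchanged when the same index permutation is applied to every string.

```python
import random
from itertools import combinations


class HugeObjectOracle:
    """Illustrative query access to samples from a distribution over {0,1}^n."""

    def __init__(self, support, weights, seed=0):
        self.support = support      # list of equal-length bit tuples (assumed explicit)
        self.weights = weights      # matching probabilities
        self.rng = random.Random(seed)
        self.samples = []           # strings drawn so far, hidden from the tester
        self.queries = 0            # total number of bit queries made

    def draw(self):
        """Draw a fresh sample and return a handle to it; no bits are revealed."""
        s = self.rng.choices(self.support, weights=self.weights, k=1)[0]
        self.samples.append(s)
        return len(self.samples) - 1

    def query(self, handle, index):
        """Reveal one bit of a previously drawn sample and count the query."""
        self.queries += 1
        return self.samples[handle][index]


def permute_support(support, perm):
    """Apply the same index permutation to every string in the support."""
    return [tuple(s[perm[i]] for i in range(len(perm))) for s in support]


def pairwise_distance_profile(support, weights):
    """Sorted (weight product, Hamming distance) pairs over all support pairs."""
    profile = []
    for (s, p), (t, q) in combinations(zip(support, weights), 2):
        dist = sum(a != b for a, b in zip(s, t))
        profile.append((p * q, dist))
    return sorted(profile)
```

For instance, reordering the bits of every support string with one shared permutation leaves `pairwise_distance_profile` unchanged, whereas a statistic such as "the first bit is 1 with probability at least 1/2" would not be preserved and is therefore not index-invariant.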
