The Empirical Mean is Minimax Optimal for Local Glivenko-Cantelli
Doron Cohen, Aryeh Kontorovich, Roi Weiss
TL;DR
The paper analyzes the Local Glivenko-Cantelli (LGC) setting for product Bernoulli measures and asks what can be learned with estimators other than the Empirical Mean Estimator (EME). It proves that, under non-pathology, decay, and symmetry conditions, the LGC class is the largest family learnable by any estimator, and that the EME achieves the minimax rate over such families. It further shows that allowing certain pathologies makes strictly larger classes learnable (e.g., the union with constant sequences) via a relaxation construction, and it states a conjecture and open problems on still richer extensions. The approach combines information-theoretic lower bounds (Fano) with constructive estimators and testing-based schemes, yielding sharp lower bounds together with provably consistent estimators. The results clarify the fundamental limits of distribution-dependent uniform convergence in high dimensions and indicate when structure-aware estimators can outperform the standard EME.
Abstract
We revisit the recently introduced Local Glivenko-Cantelli setting, which studies distribution-dependent uniform convergence rates of the Empirical Mean Estimator (EME). In this work, we investigate generalizations of this setting where arbitrary estimators are allowed rather than just the EME. Can a strictly larger class of measures be learned? Can better risk decay rates be obtained? We provide exhaustive answers to these questions, which are both negative, provided the learner is barred from exploiting some infinite-dimensional pathologies. On the other hand, allowing such exploits does lead to a strictly larger class of learnable measures.
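To make the setting concrete, the following is a minimal simulation sketch, not the authors' code. It assumes the risk of interest is the expected sup-norm deviation between the estimated and true parameter vectors of a product Bernoulli measure, and it truncates the infinite sequence to finitely many coordinates, which the paper's setting does not do. The parameter sequence `p`, the function names, and the sample sizes are all hypothetical choices made for illustration.

```python
import random


def draw_product_bernoulli(p, n, rng):
    """Draw n independent binary vectors whose j-th coordinate is Bernoulli(p[j])."""
    return [[1 if rng.random() < pj else 0 for pj in p] for _ in range(n)]


def empirical_mean_estimate(samples):
    """Empirical Mean Estimator: the coordinate-wise average of the samples."""
    n = len(samples)
    d = len(samples[0])
    return [sum(x[j] for x in samples) / n for j in range(d)]


def sup_deviation(p, p_hat):
    """Largest absolute coordinate-wise gap between p and its estimate."""
    return max(abs(a - b) for a, b in zip(p, p_hat))


def mean_sup_deviation(p, n, trials, seed=0):
    """Monte Carlo estimate of the expected sup-norm deviation of the EME."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = draw_product_bernoulli(p, n, rng)
        total += sup_deviation(p, empirical_mean_estimate(samples))
    return total / trials


# Hypothetical decaying parameter sequence, truncated to 50 coordinates
# (an assumption made only so the simulation is finite).
p = [1.0 / (j + 2) ** 2 for j in range(50)]

small_n = mean_sup_deviation(p, n=20, trials=20)
large_n = mean_sup_deviation(p, n=500, trials=20)
print(f"n=20:  {small_n:.4f}")
print(f"n=500: {large_n:.4f}")
```

For a fast-decaying sequence like this one, the simulated deviation shrinks as the sample size grows. The simulation only illustrates the quantity being studied; it says nothing about minimax optimality, which concerns the best rate achievable by any estimator.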
