One-Bit Distributed Mean Estimation with Unknown Variance
Ritesh Kumar, Shashank Vatedka
TL;DR
This work addresses distributed mean estimation with one bit of communication per user when the variance is unknown, focusing on scale-location families. It designs both non-adaptive and adaptive protocols and shows that a simple two-round adaptive scheme achieves an asymptotic MSE of $\frac{\sigma^2}{4 f_X^2(0)}$. A matching lower bound for a broad class of symmetric log-concave base densities establishes that the adaptive scheme is optimal for that class. A general lower bound for non-adaptive protocols reveals a quantifiable gap, expressed through a distribution-dependent constant $T(f_X)$, between adaptive and non-adaptive performance for several distributions, including generalized Gaussians with $\beta<1.85$. Simulations across multiple distributions corroborate the theoretical results, showing a positive advantage for adaptivity and pointing to potential benefits for privacy-preserving and communication-constrained learning systems.
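As a quick worked instance of the formula $\frac{\sigma^2}{4 f_X^2(0)}$, assume (for illustration only; this summary does not fix these choices) that the base density $f_X$ is the standard Gaussian with unit variance and that the asymptotic MSE is normalized by the number of users. Then

$$
f_X(0) = \frac{1}{\sqrt{2\pi}}, \qquad \frac{\sigma^2}{4 f_X^2(0)} = \frac{\sigma^2}{4 \cdot \frac{1}{2\pi}} = \frac{\pi \sigma^2}{2} \approx 1.57\,\sigma^2 .
$$

Under these assumptions, the one-bit adaptive scheme would lose a factor of about $\pi/2$ relative to the variance $\sigma^2$ of the unquantized sample mean. This is the same constant that appears in the asymptotic variance of the sample median.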
Abstract
In this work, we study the problem of distributed mean estimation with $1$-bit communication constraints when the variance is unknown. We focus on the specific case where each user has access to one i.i.d. sample drawn from a distribution that belongs to a scale-location family, and is limited to sending just a single bit of information to a central server whose goal is to estimate the mean. We propose simple non-adaptive and adaptive protocols that are shown to be asymptotically normal. We derive bounds on the asymptotic (in the number of users) Mean Squared Error (MSE) achieved by these protocols. For a class of symmetric log-concave distributions, we derive matching lower bounds for the MSE achieved by adaptive protocols, proving the optimality of our scheme. Furthermore, we develop a lower bound on the MSE for non-adaptive protocols that applies to any symmetric strictly log-concave distribution by means of a refined squared Hellinger distance analysis. Through this, we show that for many common distributions including a subclass of the generalized Gaussian family, the asymptotic minimax MSE achieved by the best non-adaptive protocol is higher than that achieved by our simple adaptive protocol. Our simulation results confirm a positive gap between the adaptive and non-adaptive settings, aligning with the theoretical bounds.
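To make the adaptive setting concrete, the following is a minimal simulation sketch of one possible two-round one-bit scheme. It is not the authors' exact protocol: the Gaussian base density, the fixed first-round thresholds `t_low` and `t_high`, the split `round1_frac`, and all function names are assumptions made for illustration. Round 1 uses two fixed thresholds to get rough estimates of the mean and scale, and round 2 thresholds the remaining samples at the rough mean. If the asymptotic MSE is read as normalized by the number of users, the Gaussian target would be $\pi\sigma^2/2$.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed base density f_X: standard Gaussian


def _clipped_fraction(bits):
    """Fraction of ones, kept away from 0 and 1 so that inv_cdf stays finite."""
    m = len(bits)
    p = sum(bits) / m
    return min(max(p, 0.5 / m), 1.0 - 0.5 / m)


def two_round_estimate(samples, t_low=-1.0, t_high=1.0, round1_frac=0.1):
    """Illustrative two-round one-bit estimator; returns (mu_hat, n_round2)."""
    n = len(samples)
    n1 = max(2, int(round1_frac * n))
    half = n1 // 2

    # Round 1: each user sends the single bit 1{X <= fixed threshold}.
    bits_low = [x <= t_low for x in samples[:half]]
    bits_high = [x <= t_high for x in samples[half:n1]]
    z_low = STD_NORMAL.inv_cdf(_clipped_fraction(bits_low))
    z_high = STD_NORMAL.inv_cdf(_clipped_fraction(bits_high))

    # Invert P(X <= t) = Phi((t - mu) / sigma) at the two thresholds.
    gap = z_high - z_low
    if gap <= 0:  # degenerate round 1; fall back to an arbitrary unit gap
        gap = 1.0
    sigma_hat = (t_high - t_low) / gap
    mu_rough = t_low - sigma_hat * z_low

    # Round 2: remaining users send 1{X <= mu_rough}, a threshold near the mean.
    bits2 = [x <= mu_rough for x in samples[n1:]]
    z2 = STD_NORMAL.inv_cdf(_clipped_fraction(bits2))
    return mu_rough - sigma_hat * z2, n - n1


def normalized_mse(mu, sigma, n, trials, seed=0):
    """Monte Carlo estimate of n_round2 * E[(mu_hat - mu)^2]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        samples = [rng.gauss(mu, sigma) for _ in range(n)]
        mu_hat, n2 = two_round_estimate(samples)
        total += n2 * (mu_hat - mu) ** 2
    return total / trials
```

Calling `normalized_mse(mu=0.3, sigma=2.0, n=10000, trials=100)` and comparing the result with `math.pi * 2.0**2 / 2` gives a rough check. At finite sample sizes the first-round scale error inflates the value somewhat, so only approximate agreement should be expected.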
