One-Bit Distributed Mean Estimation with Unknown Variance

Ritesh Kumar, Shashank Vatedka

TL;DR

This work addresses distributed mean estimation with one-bit communication per user when the variance is unknown, focusing on scale-location families. It designs both non-adaptive and adaptive protocols and shows that a simple two-round adaptive scheme achieves an asymptotic MSE of $\frac{\sigma^2}{4 f_X^2(0)}$. A matching lower bound for a broad class of symmetric log-concave base densities establishes that the adaptive scheme is optimal for that class. The paper also derives a general lower bound for non-adaptive protocols, which reveals a quantifiable gap (via a distribution-dependent constant $T(f_X)$) between adaptive and non-adaptive performance for several distributions, including generalized Gaussians with $\beta<1.85$. Simulations across multiple distributions corroborate the theoretical results, showing a positive advantage of adaptivity and highlighting the potential benefits for privacy-preserving and communication-constrained learning systems.

Abstract

In this work, we study the problem of distributed mean estimation with $1$-bit communication constraints when the variance is unknown. We focus on the specific case where each user has access to one i.i.d. sample drawn from a distribution that belongs to a scale-location family, and is limited to sending just a single bit of information to a central server whose goal is to estimate the mean. We propose simple non-adaptive and adaptive protocols that are shown to be asymptotically normal. We derive bounds on the asymptotic (in the number of users) Mean Squared Error (MSE) achieved by these protocols. For a class of symmetric log-concave distributions, we derive matching lower bounds for the MSE achieved by adaptive protocols, proving the optimality of our scheme. Furthermore, we develop a lower bound on the MSE for non-adaptive protocols that applies to any symmetric strictly log-concave distribution by means of a refined squared Hellinger distance analysis. Through this, we show that for many common distributions including a subclass of the generalized Gaussian family, the asymptotic minimax MSE achieved by the best non-adaptive protocol is higher than that achieved by our simple adaptive protocol. Our simulation results confirm a positive gap between the adaptive and non-adaptive settings, aligning with the theoretical bounds.
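
As a worked illustration of the adaptive constant $\frac{1}{4 f_X^2(0)}$ quoted in the TL;DR, the sketch below evaluates it for the unit-variance generalized Gaussian family. It assumes the standard parameterization $f_X(x) = \frac{\beta}{2\alpha\Gamma(1/\beta)} e^{-(|x|/\alpha)^\beta}$ with $\alpha$ chosen so that the variance equals one; the function names are hypothetical and not taken from the paper.

```python
import math


def ggd_density_at_zero(beta: float) -> float:
    """f_X(0) for the unit-variance generalized Gaussian with shape beta.

    Assumes the density beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x|/alpha)**beta),
    whose variance is alpha**2 * Gamma(3/beta) / Gamma(1/beta).
    """
    # Choose the scale alpha so that the variance is exactly one.
    alpha = math.sqrt(math.gamma(1.0 / beta) / math.gamma(3.0 / beta))
    return beta / (2.0 * alpha * math.gamma(1.0 / beta))


def adaptive_constant(beta: float) -> float:
    """The constant 1 / (4 * f_X(0)**2) for the unit-variance family."""
    return 1.0 / (4.0 * ggd_density_at_zero(beta) ** 2)


if __name__ == "__main__":
    for beta in (1.0, 1.5, 2.0):
        print(f"beta={beta:.1f}  C_adapt={adaptive_constant(beta):.4f}")
```

Two closed-form checks are available under this parameterization: the Laplace case ($\beta=1$) gives $1/2$, and the Gaussian case ($\beta=2$) gives $\pi/2$. The constant $T(f_X)$ that governs the non-adaptive bound is not defined in this summary, so it is not computed here.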

Paper Structure

This paper contains 36 sections, 15 theorems, 237 equations, 10 figures, 2 tables.

Key Result

Theorem 3.1

For every $\mu \in \mathbb R$ and $\sigma>0$, the protocol in Sec. \ref{sec:nonadaptivescheme} is strongly consistent, asymptotically normal, and satisfies where for $i=1,2$,

Figures (10)

  • Figure 1: Illustration of our non-adaptive protocol. We partition the $n$ users into subsets of $n_1$ and $n_2$ users. Each user independently encodes their sample using a $1$-bit quantizer, with threshold $\theta_1$ for the first $n_1$ users and $\theta_2$ for the remaining $n_2$ users.
  • Figure 2: Our two-round adaptive protocol: The first $n_1 + n_2$ users use the non-adaptive scheme to produce coarse estimates $\hat{\mu}_c, \hat{\sigma}_c$. The value of $\hat{\mu}_c$ is broadcast to the remaining $n_3$ users, who then send 1-bit messages using $\hat{\mu}_c$ as their quantization threshold.
  • Figure 3: Illustration of bounds on $\frac{1}{\sigma^2}\lim_{n\to\infty}\mathrm{MSE}(\hat{\mu})$ for the generalized Gaussian family parameterized by $\beta>1$. Here, $C_{\text{non}}$ denotes the lower bound on this quantity for non-adaptive protocols, whereas $C_{\text{adapt}}$ denotes the corresponding upper bound for our simple adaptive protocol. As $\beta$ increases, $C_{\mathrm{non}}$ decreases while $C_{\mathrm{adapt}}$ increases. The curves intersect at $\beta\approx 1.8488$. This allows us to conclude that non-adaptive protocols are provably suboptimal compared to adaptive protocols for $\beta<1.8488$.
  • Figure 4: Ratio of constants $C_{\mathrm{non}}/C_{\mathrm{adapt}}$ for the unit-variance Generalized Gaussian Distribution (GGD) as a function of the shape parameter $\beta \in[1.1,1.85)$. We compute $C_{\mathrm{non}}=\tfrac{0.1034}{T(f_X)}$ (non-adaptive lower-bound constant) and $C_{\mathrm{adapt}}=\tfrac{1}{4f_X(0)^2}$ (adaptive constant) from the unit-variance density $f_X$. The curves cross at $\beta^\star \approx 1.85$. For $\beta<\beta^\star$ the non-adaptive lower bound exceeds the adaptive constant (ratio $>1$), while for $\beta>\beta^\star$ the adaptive constant is larger (ratio $<1$).
  • Figure 5: Worst-case and average (over $\mu$) MSE across four source distributions under one-bit protocols. Benchmarks are computed using $f_X(0)$ for each family; adaptive and non-adaptive curves are shown according to the legend. The curve labeled Asymptotic (Non-adaptive) represents the asymptotic value of $\mathrm{MSE}$ predicted by Theorem \ref{thm:mean_varinace_estimation}.
  • ...and 5 more figures
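
The protocols described in the captions of Figures 1 and 2 can be simulated with a minimal sketch such as the one below. Several parts are assumptions rather than the paper's construction: the base density is taken to be standard Gaussian and known to the server, the server recovers $(\mu,\sigma)$ by inverting the base CDF at the empirical bit frequencies, and the thresholds, group sizes, and function names are illustrative only.

```python
import random
from statistics import NormalDist

BASE = NormalDist()  # assumed base density f_X: standard Gaussian, known to the server


def fraction_below(samples, threshold):
    """Empirical mean of the one-bit messages 1{x <= threshold}."""
    return sum(1 for x in samples if x <= threshold) / len(samples)


def clipped_quantile(p, n):
    """Base quantile F_X^{-1}(p), with p kept strictly inside (0, 1)."""
    eps = 0.5 / n
    return BASE.inv_cdf(min(max(p, eps), 1.0 - eps))


def non_adaptive(group1, group2, theta1, theta2):
    """Coarse estimates from two groups quantizing at theta1 and theta2 (Figure 1)."""
    z1 = clipped_quantile(fraction_below(group1, theta1), len(group1))
    z2 = clipped_quantile(fraction_below(group2, theta2), len(group2))
    if z1 == z2:
        raise ValueError("thresholds are not distinguishable from the received bits")
    # Assumed inversion: theta_i = mu + sigma * z_i for i = 1, 2.
    sigma_c = (theta2 - theta1) / (z2 - z1)
    mu_c = theta1 - sigma_c * z1
    return mu_c, sigma_c


def adaptive(samples, n1, n2, theta1, theta2):
    """Two-round scheme: the remaining users quantize at the broadcast mu_c (Figure 2)."""
    mu_c, sigma_c = non_adaptive(samples[:n1], samples[n1:n1 + n2], theta1, theta2)
    round2 = samples[n1 + n2:]
    z3 = clipped_quantile(fraction_below(round2, mu_c), len(round2))
    # Assumed refinement: mu_c = mu + sigma * z3, solved for mu with sigma_c plugged in.
    return mu_c - sigma_c * z3


if __name__ == "__main__":
    rng = random.Random(0)
    mu, sigma = 0.7, 2.0
    samples = [rng.gauss(mu, sigma) for _ in range(30000)]
    print("adaptive estimate of mu:", adaptive(samples, 5000, 5000, -1.0, 1.0))
```

Because the second-round threshold sits close to $\mu$, the refinement step behaves much like a one-bit median estimate, which is consistent with the $\frac{\sigma^2}{4 f_X^2(0)}$ form of the adaptive MSE. The sketch does not attempt to reproduce the paper's choice of $n_1$, $n_2$, $n_3$ or its thresholds.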

Theorems & Definitions (20)

  • Definition 2.1
  • Theorem 3.1
  • Theorem 3.2
  • proof
  • Theorem 3.3
  • Theorem 4.1
  • Remark 4.2
  • Theorem 4.3
  • Theorem B.1
  • Lemma B.2
  • ...and 10 more