On the Optimality of the Median-of-Means Estimator under Adversarial Contamination

Xabier de Juan; Santiago Mazuelas

TL;DR

This work characterizes the optimality of the Median-of-Means (MoM) estimator under adversarial contamination across multiple distribution classes. It proves that MoM is minimax optimal for the class of distributions with finite variance and for the class with infinite variance but finite absolute $(1+r)$-th moment. It also shows that MoM achieves favorable bounds for sub-exponential and sub-Gaussian distributions when the blocks are chosen appropriately. The results further identify a lower bound that rules out improvement beyond a $\sqrt{\alpha}$ bias for certain general distributions, which makes MoM sub-optimal for light-tailed distributions, whereas MoM performs well for symmetric distributions. Overall, the paper clarifies when MoM is most effective under contamination and where it falls short of other robust estimators.

Abstract

The Median-of-Means (MoM) is a robust estimator widely used in machine learning that is known to be (minimax) optimal in scenarios where samples are i.i.d. In more adverse scenarios, samples are contaminated by an adversary that can inspect and modify the data. Previous work has theoretically shown the suitability of the MoM estimator in certain contaminated settings. However, the (minimax) optimality of MoM and its limitations under adversarial contamination remain unknown beyond the Gaussian case. In this paper, we present upper and lower bounds for the error of MoM under adversarial contamination for multiple classes of distributions. In particular, we show that MoM is (minimax) optimal in the class of distributions with finite variance, as well as in the class of distributions with infinite variance and finite absolute $(1+r)$-th moment. We also provide lower bounds for MoM's error that match the order of the presented upper bounds, and show that MoM is sub-optimal for light-tailed distributions.
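
For readers unfamiliar with the estimator, the following is a minimal Python sketch of a basic MoM computation under contamination. It is not the paper's code. The function name `median_of_means`, the random shuffle before blocking, the dropping of leftover samples, and the demo values (1000 Gaussian draws, 20 replaced points, 101 blocks) are all illustrative assumptions, not choices taken from the paper.

```python
import random
import statistics


def median_of_means(samples, num_blocks, rng=None):
    """Shuffle, split into equal-size blocks, average each block, return the median."""
    if not 1 <= num_blocks <= len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    data = list(samples)
    # Assumption: random partition into blocks; a fixed seed keeps the sketch reproducible.
    (rng or random.Random(0)).shuffle(data)
    block_size = len(data) // num_blocks  # assumption: leftover samples are dropped
    block_means = [
        statistics.fmean(data[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Hypothetical demo: 1000 draws from N(0, 1); an "adversary" replaces 20 of them with 1e6.
demo_rng = random.Random(1)
clean = [demo_rng.gauss(0.0, 1.0) for _ in range(1000)]
contaminated = clean[:]
for idx in demo_rng.sample(range(1000), 20):
    contaminated[idx] = 1e6

mom_estimate = median_of_means(contaminated, num_blocks=101)
mean_estimate = statistics.fmean(contaminated)
print(f"MoM estimate: {mom_estimate:.3f}, empirical mean: {mean_estimate:.1f}")
```

In this demo at most 20 of the 101 blocks can contain a replaced point, so the median is taken over a majority of clean block means and stays near the true mean of 0, while the empirical mean is dragged far away. The number of blocks is picked by hand here; how it should scale with the contamination level and the distribution class is what the paper's bounds address.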
