Robust Distribution Learning with Local and Global Adversarial Corruptions

Sloan Nietert; Ziv Goldfeld; Soroosh Shafiee

Robust Distribution Learning with Local and Global Adversarial Corruptions

Sloan Nietert, Ziv Goldfeld, Soroosh Shafiee

TL;DR

An efficient finite-sample algorithm with error bounded by $\sqrt{\varepsilon k} + \rho + \tilde{O}(d\sqrt{k}n^{-1/(k \lor 2)})$ when $P$ has bounded covariance is developed.

Abstract

We consider learning in an adversarial environment, where an $\varepsilon$-fraction of samples from a distribution $P$ are arbitrarily modified (global corruptions) and the remaining perturbations have average magnitude bounded by $ρ$ (local corruptions). Given access to $n$ such corrupted samples, we seek a computationally efficient estimator $\hat{P}_n$ that minimizes the Wasserstein distance $\mathsf{W}_1(\hat{P}_n,P)$. In fact, we attack the fine-grained task of minimizing $\mathsf{W}_1(Π_\# \hat{P}_n, Π_\# P)$ for all orthogonal projections $Π\in \mathbb{R}^{d \times d}$, with performance scaling with $\mathrm{rank}(Π) = k$. This allows us to account simultaneously for mean estimation ($k=1$), distribution estimation ($k=d$), as well as the settings interpolating between these two extremes. We characterize the optimal population-limit risk for this task and then develop an efficient finite-sample algorithm with error bounded by $\sqrt{\varepsilon k} + ρ+ \tilde{O}(d\sqrt{k}n^{-1/(k \lor 2)})$ when $P$ has bounded covariance. This guarantee holds uniformly in $k$ and is minimax optimal up to the sub-optimality of the plug-in estimator when $ρ= \varepsilon = 0$. Our efficient procedure relies on a novel trace norm approximation of an ideal yet intractable 2-Wasserstein projection estimator. We apply this algorithm to robust stochastic optimization, and, in the process, uncover a new method for overcoming the curse of dimensionality in Wasserstein distributionally robust optimization.

Robust Distribution Learning with Local and Global Adversarial Corruptions

TL;DR

An efficient finite-sample algorithm with error bounded by

when

has bounded covariance is developed.

Abstract

We consider learning in an adversarial environment, where an

-fraction of samples from a distribution

are arbitrarily modified (global corruptions) and the remaining perturbations have average magnitude bounded by

(local corruptions). Given access to

such corrupted samples, we seek a computationally efficient estimator

that minimizes the Wasserstein distance

. In fact, we attack the fine-grained task of minimizing

for all orthogonal projections

, with performance scaling with

. This allows us to account simultaneously for mean estimation (

), distribution estimation (

), as well as the settings interpolating between these two extremes. We characterize the optimal population-limit risk for this task and then develop an efficient finite-sample algorithm with error bounded by

when

has bounded covariance. This guarantee holds uniformly in

and is minimax optimal up to the sub-optimality of the plug-in estimator when

. Our efficient procedure relies on a novel trace norm approximation of an ideal yet intractable 2-Wasserstein projection estimator. We apply this algorithm to robust stochastic optimization, and, in the process, uncover a new method for overcoming the curse of dimensionality in Wasserstein distributionally robust optimization.

Paper Structure