GROS: A General Robust Aggregation Strategy

Alejandro Cholaquidis; Emilien Joly; Leonardo Moreno

Abstract

We introduce GROS, a new and very general robust procedure for combining estimators in metric spaces. The method is reminiscent of the well-known median of means, as described in \cite{devroye2016sub}. First, the sample is divided into $K$ groups. Next, an estimator is computed on each group. Finally, these $K$ estimators are combined using a robust procedure. We prove that the resulting estimator is sub-Gaussian and derive its breakdown point in the sense of Donoho. The robust procedure involves a minimization problem over a general metric space, but we show that the same sub-Gaussianity (up to a constant) is obtained when the minimization is restricted to the $K$ group estimators, which makes GROS feasible in practice. The performance of GROS is evaluated through five simulation studies: the first focuses on classification using $k$-means, the second on the multi-armed bandit problem, the third on regression, and the fourth on set estimation under a noisy model. Lastly, we apply GROS to obtain a robust persistence diagram.
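The three steps in the abstract (split, estimate per group, combine robustly) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the combining rule picks, among the $K$ group estimators, the one whose distance to its $(\lfloor K/2\rfloor+1)$-th closest estimator (counting itself) is smallest. That objective is an assumption, as the abstract does not state it. The function names `gros_over_sample` and `gros_mean`, and the choice of the sample mean as the per-group estimator, are hypothetical.

```python
import random
from statistics import fmean


def gros_over_sample(estimates, dist):
    """Combine K estimates by minimizing over the estimates themselves.

    Assumed objective (not stated in the abstract): for each candidate,
    take the radius of the smallest ball around it that contains more
    than half of the estimates, and return the candidate with the
    smallest such radius.
    """
    K = len(estimates)
    m = K // 2 + 1  # smallest count strictly larger than K / 2

    def radius(candidate):
        # Distance from the candidate to its m-th closest estimate
        # (the candidate itself is included at distance 0).
        return sorted(dist(candidate, e) for e in estimates)[m - 1]

    return min(estimates, key=radius)


def gros_mean(sample, K, seed=0):
    """Hypothetical end-to-end example for real-valued data."""
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    # Step 1: split the sample into K groups of nearly equal size.
    groups = [shuffled[i::K] for i in range(K)]
    # Step 2: compute one estimator per group (here, the group mean).
    estimates = [fmean(g) for g in groups]
    # Step 3: combine the K estimators robustly.
    return gros_over_sample(estimates, lambda a, b: abs(a - b))


if __name__ == "__main__":
    rng = random.Random(1)
    clean = [rng.uniform(-1.0, 1.0) for _ in range(95)]
    outliers = [1e6] * 5
    print("plain mean:", fmean(clean + outliers))
    print("GROS-style mean:", gros_mean(clean + outliers, K=11))
```

With 5 outliers and $K=11$ groups, at most 5 groups are contaminated, so more than half of the group means are computed from clean data and the selected estimate is one of them. The only metric-specific ingredient is `dist`, so the same combining function applies to any metric space in which pairwise distances can be computed.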

Paper Structure (17 sections, 6 theorems, 34 equations, 10 figures)

This paper contains 17 sections, 6 theorems, 34 equations, 10 figures.

Introduction
Robust aggregation of weakly convergent estimators
Mis-specification of the set $\mathcal{M}$
Computational aspects
Some applications of GROS
Classification by $k$-means
Simulations
Bandits
Heavy-tailed bandits
Simulations
Robust regression
Simulations
Robust set estimation
Simulations
Robust persistent diagram
...and 2 more sections

Key Result

Lemma 1

Assume that there exist an $\eta \in \mathcal{M}$ and an $I\subset [K]$ with $|I|>K/2$ such that for all $j\in I$, $d(\mu_j,\eta)\le t$. Then, $d(\mu^*,\eta)\le 2t$.
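The factor $2$ in the lemma follows from a short triangle-inequality argument. The sketch below assumes that $\mu^*$ is a minimizer over $\mathcal{M}$ of the smallest radius of a ball containing more than half of the $\mu_j$, and that this minimum is attained. This summary does not restate the definition of $\mu^*$, so both points are assumptions, and the paper's own proof may differ in detail.

```latex
% Assumed definition (not restated in this summary):
\[
  \mu^* \in \operatorname*{arg\,min}_{\nu \in \mathcal{M}}
  \; \min_{J \subset [K],\, |J| > K/2} \; \max_{j \in J} d(\mu_j, \nu).
\]
% Step 1: \eta is feasible and attains objective value at most t with J = I,
% so the minimizer does at least as well: there is J^* with |J^*| > K/2 and
\[
  \max_{j \in J^*} d(\mu_j, \mu^*) \le \max_{j \in I} d(\mu_j, \eta) \le t.
\]
% Step 2: |I| + |J^*| > K, so I \cap J^* is non-empty; pick j_0 in it.
% Step 3: triangle inequality through \mu_{j_0}:
\[
  d(\mu^*, \eta) \le d(\mu^*, \mu_{j_0}) + d(\mu_{j_0}, \eta) \le t + t = 2t.
\]
```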

Figures (10)

Figure 1: Simulation of 1000 observations of the multivariate Student mixture \ref{student}. Observations are colored according to the mixture component from which they are drawn.
Figure 2: Box plot of classification errors, according to \ref{error}, of $K$-means, TClust, PAM and RobustKM over $1000$ replicates.
Figure 3: Cumulative gains over 500 replications, for $t=1, \ldots,750$. The red dotted horizontal line ($y=8$) is the maximum expected gain. The black dotted vertical line ($x=40$) indicates the number of random warm-up runs in the RUCB algorithm. The dashed lines depict the mean reward of the UCB (orange) and RUCB (blue) algorithms.
Figure 4: Box plot of classification errors (according to L2 distance) in $1000$ replicates. The different scenarios are obtained in the skew-normal Student distribution with $\sigma \in \{ 9,16\}$ and $\xi \in \{ 1,9\}$, fixed $\nu=3$ and $\kappa=0$.
Figure 5: Regression functions estimated with the RANW (orange), NW (black), ONL (light blue) and SBMB (blue) in one replicate. The true function is shown in red. The different scenarios are obtained in the skew-normal Student distribution with $\sigma \in \{ 9,16\}$ and $\xi \in \{ 1,9\}$, fixed $\mu=0$ and $\nu=3$.
...and 5 more figures

Theorems & Definitions (13)

Remark
Definition 1
Lemma 1
proof
Theorem 2
Remark
Lemma 3
proof
Corollary 2.1
Lemma 4
...and 3 more
