Utilizing Class Separation Distance for the Evaluation of Corruption Robustness of Machine Learning Classifiers
Georg Siedel, Silvia Vock, Andrey Morozov, Stefan Voß
TL;DR
This work addresses the problem of evaluating the corruption robustness of ML classifiers with a dataset-aware, interpretable metric. It introduces the Minimal Separation Corruption Robustness (MSCR) metric, which uses a dataset-derived corruption distance $\epsilon_{min} = \tfrac{1}{2} \min_{i,j: y_i \neq y_j} dist(x_i, x_j)$, that is, half the smallest distance between any two points of different classes. Robustness is measured by augmenting the test data with uniform noise of magnitude up to $\epsilon_{min}$, yielding $MSCR = \frac{Acc_{rob-\epsilon_{min}} - Acc_{clean}}{Acc_{clean}}$, the relative change in accuracy under that noise. The authors evaluate MSCR on 2D synthetic datasets and CIFAR-10 with several models (RF, 1NN, WideResNet) while varying the training noise level $\epsilon_{train}$ and the test noise level $\epsilon_{test}$. The results show that training with noise augmentation can improve robustness and can also slightly improve clean accuracy, so the commonly reported accuracy-robustness tradeoff is not inherent. They also show that the highest robust accuracy at a given $\epsilon_{test}$ is not necessarily reached by training at that same noise level. Overall, MSCR offers a dataset-specific, interpretable measure for comparing corruption robustness without requiring predefined corruption models, which supports risk assessment and safer deployment of ML systems in real-world settings.
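The two formulas above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the $L_\infty$ norm for $dist$ (the summary does not name the norm), averages robust accuracy over a few noise draws, and uses hypothetical names throughout (`min_class_separation`, `mscr`, `predict`, `n_draws`).

```python
import numpy as np


def min_class_separation(X, y):
    """Return eps_min: half the smallest distance between two points of different classes.

    Assumption: dist is the L-infinity norm; swap in another norm if needed.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    best = np.inf
    for i in range(len(X)):
        others = X[y != y[i]]  # all points with a different label
        if len(others) == 0:
            continue
        # L-infinity distance from X[i] to its closest differently labeled point
        d = np.abs(others - X[i]).max(axis=1).min()
        best = min(best, d)
    return 0.5 * best


def mscr(predict, X_test, y_test, eps_min, n_draws=10, seed=0):
    """Relative accuracy change under uniform noise in [-eps_min, eps_min] per feature.

    `predict` is any callable mapping an array of inputs to predicted labels.
    """
    rng = np.random.default_rng(seed)
    X_test = np.asarray(X_test, dtype=float)
    y_test = np.asarray(y_test)
    acc_clean = np.mean(predict(X_test) == y_test)
    robust_accs = []
    for _ in range(n_draws):
        noise = rng.uniform(-eps_min, eps_min, size=X_test.shape)
        robust_accs.append(np.mean(predict(X_test + noise) == y_test))
    acc_rob = np.mean(robust_accs)
    return (acc_rob - acc_clean) / acc_clean


# Toy usage: four 2D points, two classes, and a hand-written threshold "classifier".
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [4.0, 4.0]])
y = np.array([0, 0, 1, 1])
eps_min = min_class_separation(X, y)  # 1.5 for this toy data
score = mscr(lambda A: (A[:, 1] > 1.0).astype(int), X, y, eps_min)
print(eps_min, score)
```

The pairwise loop costs quadratic time in the number of samples, which is fine for a sketch but would need a nearest-neighbor index or batching on a dataset the size of CIFAR-10. For image data, the perturbed inputs would typically also be clipped to the valid pixel range, which this sketch omits.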
Abstract
Robustness is a fundamental pillar of Machine Learning (ML) classifiers, substantially determining their reliability. Methods for assessing classifier robustness are therefore essential. In this work, we address the challenge of evaluating corruption robustness in a way that allows comparability and interpretability on a given dataset. We propose a test data augmentation method that uses a robustness distance $\epsilon$ derived from the dataset's minimal class separation distance. The resulting MSCR (minimal separation corruption robustness) metric allows a dataset-specific comparison of different classifiers with respect to their corruption robustness. The MSCR value is interpretable, as it represents the classifier's avoidable loss of accuracy due to statistical corruptions. On 2D and image data, we show that the metric reflects different levels of classifier robustness. Furthermore, we observe unexpected optima in classifiers' robust accuracy when training and testing classifiers with different levels of noise. While researchers have frequently reported a significant tradeoff in accuracy when training robust models, we strengthen the view that a tradeoff between accuracy and corruption robustness is not inherent. Our results indicate that robustness training through simple data augmentation can already slightly improve accuracy.
