Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models
Francesco Croce, Naman D Singh, Matthias Hein
TL;DR
This work studies adversarial robustness in semantic segmentation under the $\ell_\infty$ threat model, a setting made difficult by per-pixel predictions. It introduces three novel losses ($\mathcal{L}_{JS}$, $\mathcal{L}_{MCE}$, $\mathcal{L}_{MCE-Bal}$) and the Segmentation Ensemble Attack (SEA) for reliable robustness evaluation, showing that prior attacks overestimate robustness. To make robust training scalable, it proposes PIR-AT, which initializes backbones with robust ImageNet classifiers and achieves state-of-the-art robustness on PASCAL-VOC and ADE20k with up to six times less training time. Together, these contributions give a stronger evaluation protocol and a cheaper training recipe for robust semantic segmentation models.
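For reference, the $\ell_\infty$ threat model is commonly formalized as the constrained problem below. The notation is assumed here for illustration and is not taken from this summary: $f$ is the segmentation model, $x$ an input image with $d$ values in $[0,1]$, $y$ the per-pixel label map, $\epsilon$ the perturbation budget, and $\mathcal{L}$ a placeholder for whichever attack loss is used.

$$\max_{\delta}\ \mathcal{L}\big(f(x+\delta),\, y\big) \quad \text{s.t.} \quad \|\delta\|_\infty \le \epsilon, \quad x+\delta \in [0,1]^{d}$$

Because $f$ outputs one prediction per pixel, $\mathcal{L}$ has to aggregate many per-pixel terms, which is one reason the choice of loss matters more here than in image classification.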
Abstract
Adversarial robustness has been studied extensively in image classification, especially for the $\ell_\infty$-threat model, but significantly less so for related tasks such as object detection and semantic segmentation, where attacks turn out to be a much harder optimization problem than for image classification. We propose several novel, problem-specific attacks that minimize different metrics, namely accuracy and mIoU. The ensemble of our attacks, SEA, shows that existing attacks severely overestimate the robustness of semantic segmentation models. Surprisingly, models from existing attempts at adversarial training for semantic segmentation turn out to be only weakly robust or even completely non-robust. We investigate why previous adaptations of adversarial training to semantic segmentation failed and show how recently proposed robust ImageNet backbones can be used to obtain adversarially robust semantic segmentation models with up to six times less training time for PASCAL-VOC and the more challenging ADE20k. The associated code and robust models are available at https://github.com/nmndeep/robust-segmentation
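The two metrics that the attacks aim to reduce, pixel accuracy and mIoU, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, labels are assumed to be flattened lists of integer class indices, and mIoU is assumed to average only over classes that occur in the prediction or the ground truth.

```python
def confusion_matrix(pred, target, num_classes):
    """Count pixels for each (true class, predicted class) pair."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(cm[c][c] for c in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Mean over classes of TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy 2x2 image, flattened: one pixel of class 0 is mislabeled as class 1.
target = [0, 0, 1, 1]
pred = [0, 1, 1, 1]
cm = confusion_matrix(pred, target, num_classes=2)
print(pixel_accuracy(cm))  # 0.75
print(mean_iou(cm))        # (1/2 + 2/3) / 2, about 0.583
```

In this toy case a single flipped pixel lowers accuracy to 0.75 but mIoU to roughly 0.58. The two metrics respond differently to the same perturbation, so an attack tuned to one is not necessarily the strongest against the other.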
