Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models

Francesco Croce; Naman D Singh; Matthias Hein

TL;DR

This work studies adversarial robustness in semantic segmentation under the $\ell_\infty$ threat model, a setting made difficult by per-pixel predictions. It introduces three novel losses ($\mathcal{L}_{JS}$, $\mathcal{L}_{MCE}$, $\mathcal{L}_{MCE-Bal}$) and the Segmentation Ensemble Attack (SEA) to provide reliable robustness evaluation, showing that prior attacks overestimate robustness. To enable scalable robust training, it proposes PIR-AT, which initializes backbones with robust ImageNet classifiers, achieving state-of-the-art robustness on Pascal-Voc and ADE20K with substantially reduced training time. Together, these advances offer practical guidance for deploying robust semantic segmentation systems and set a new benchmark for evaluation and training efficiency.

Abstract

Adversarial robustness has been studied extensively in image classification, especially for the $\ell_\infty$-threat model, but significantly less so for related tasks such as object detection and semantic segmentation, where attacks turn out to be a much harder optimization problem than for image classification. We propose several problem-specific novel attacks minimizing different metrics in accuracy and mIoU. The ensemble of our attacks, SEA, shows that existing attacks severely overestimate the robustness of semantic segmentation models. Surprisingly, existing attempts at adversarial training for semantic segmentation models turn out to be weak or even completely non-robust. We investigate why previous adaptations of adversarial training to semantic segmentation failed and show how recently proposed robust ImageNet backbones can be used to obtain adversarially robust semantic segmentation models with up to six times less training time for PASCAL-VOC and the more challenging ADE20K. The associated code and robust models are available at https://github.com/nmndeep/robust-segmentation
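The two metrics named in the abstract can be made concrete with a small sketch. The code below is an illustration and is not taken from the authors' repository: the function names, the `ignore_index=255` convention, and the choice to skip classes absent from both prediction and label are assumptions, and mIoU is computed on a single toy mask rather than aggregated over a dataset.

```python
def pixel_accuracy(pred, target, ignore_index=255):
    """Fraction of non-ignored pixels whose predicted class equals the label."""
    pairs = [(p, t) for p, t in zip(pred, target) if t != ignore_index]
    if not pairs:
        return 0.0
    return sum(p == t for p, t in pairs) / len(pairs)


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean over classes of |pred AND target| / |pred OR target|.

    Assumption: classes that appear in neither pred nor target are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = 0
        union = 0
        for p, t in zip(pred, target):
            if t == ignore_index:
                continue
            inter += int(p == c and t == c)
            union += int(p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Flattened 2x3 toy mask with three classes (hypothetical data).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
acc = pixel_accuracy(pred, target)             # 4 of 6 pixels correct
miou = mean_iou(pred, target, num_classes=3)   # mean of 1/3, 2/3, 1/2
```

Because mIoU averages over classes rather than pixels, flipping a few pixels of a small class can lower it more than flipping the same number of pixels of a large class, which is one reason an attack tuned for accuracy need not be the strongest attack on mIoU.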

Paper Structure (27 sections, 16 equations, 10 figures, 9 tables)

Introduction
Related Work
Adversarial Attacks for Semantic Segmentation
Setup
How to efficiently attack mIoU
Why do attacks on semantic segmentation require new loss functions compared to image classification?
Why does the cross-entropy loss not work for semantic segmentation?
Novel attacks on semantic segmentation
Performance of our single attacks
Optimization techniques for adversarial attacks on semantic segmentation
Segmentation Ensemble Attack (SEA)
Adversarially Robust Segmentation Models
PIR-AT: robust models via robust initialization
Ablation study of AT vs PIR-AT
Conclusion
...and 12 more sections

Figures (10)

Figure 1: Effect of adversarial attacks on semantic segmentation models. For a validation image of ADE20K (first column, with ground truth mask), we show the image perturbed by targeted $\ell_\infty$-attacks ($\epsilon_\infty=2/255$, target class "grass" or "sky") and the predicted segmentation. For a clean model the attack completely alters the segmentation map, while our robust model (UPerNet + ConvNeXt-T trained with 5-step PIR-AT for 128 epochs) is minimally affected. For illustration, we use targeted attacks here rather than the untargeted ones used in the rest of the paper. More illustrations are in \ref{app:additional-fig}.
Figure 2: Comparison of const-$\epsilon$ and red-$\epsilon$ optimization schemes. Attack accuracy for the robust PIR-AT UPerNet + ConvNeXt-T model from Table \ref{tab:comp-losses} on Pascal-Voc, across different losses for the same iteration budget. The radius reduction (red-$\epsilon$) scheme performs best across all attacks, and it even improves the worst case over all attacks.
Figure 3: Visualizing adversarial images and their segmentation outputs. We show the perturbed images, the corresponding predicted segmentation masks, and the average accuracy for increasing radii, for both the clean and our PIR-AT models. The adversarial images are generated on Pascal-Voc with APGD on $\mathcal{L}_\textrm{Mask-CE}$. For the clean model, the predicted mask deviates significantly from the ground truth even at a small radius of 0.5/255, whereas for the PIR-AT model the predicted mask stays similar to the ground truth even at a high perturbation strength of 8/255. More visualizations can be found in \ref{app:additional-fig}.
Figure 4: Comparison of const-$\epsilon$ and red-$\epsilon$ optimization schemes for mIoU. Balanced attack accuracy for the robust UPerNet + ConvNeXt-T model from Table \ref{tab:comp_robust_models_new}, trained with PIR-AT on Pascal-Voc, across different losses for the same iteration budget. The radius reduction (red-$\epsilon$) scheme performs best across all losses and radii $\epsilon_\infty$, and even the worst case over all losses improves.
Figure 5: Influence of the number of iterations in SEA. We show robust average pixel accuracy (left) and mIoU (right) when varying the number of iterations in our attack: 300 iterations give the best compute-effectiveness trade-off. We use the UPerNet model with a ConvNeXt-T backbone trained with 5-step PIR-AT on Pascal-Voc, and the attack uses radius $\epsilon_\infty=8/255$.
...and 5 more figures
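
The captions of Figures 2 and 4 refer to two ideas that a short sketch can make concrete: a radius reduction (red-$\epsilon$) schedule, and the worst case over several attacks. The code below is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, and so are the scaling factors `(2.0, 1.5, 1.0)` and budget shares `(0.3, 0.3, 0.4)`, since this page does not specify them. The worst case is taken here per image as the minimum accuracy over attacks.

```python
def radius_schedule(eps, n_iter, factors=(2.0, 1.5, 1.0), shares=(0.3, 0.3, 0.4)):
    """Per-iteration attack radius that starts above eps and ends at eps.

    Assumption: `factors` and `shares` are illustrative defaults, not values
    stated in the source. The last factor is 1.0 so the final phase runs at
    the target radius eps.
    """
    radii = []
    for factor, share in zip(factors, shares):
        radii += [factor * eps] * round(share * n_iter)
    # Trim or pad so rounding never changes the total iteration budget;
    # any padded iterations use the target radius eps.
    radii = radii[:n_iter] + [eps] * (n_iter - len(radii))
    return radii


def worst_case_accuracy(per_attack_acc):
    """Mean over images of the per-image minimum accuracy across attacks.

    per_attack_acc[a][i] is the accuracy on image i under attack a.
    """
    per_image_min = [min(accs) for accs in zip(*per_attack_acc)]
    return sum(per_image_min) / len(per_image_min)


# Target radius 8/255 with a budget of 300 iterations.
radii = radius_schedule(8 / 255, 300)

# Hypothetical per-image accuracies under two attacks on three images.
acc_by_attack = [
    [0.8, 0.3, 0.5],
    [0.6, 0.4, 0.7],
]
worst = worst_case_accuracy(acc_by_attack)
```

In a full attack, the perturbation would be projected back onto the ball of the current radius whenever the radius shrinks, so the final adversarial image respects the target $\epsilon_\infty$. The worst-case value can never exceed the mean accuracy of any single attack, which is why an ensemble gives a robustness estimate at least as pessimistic as its strongest member.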
