Towards Adversarial Robustness of Model-Level Mixture-of-Experts Architectures for Semantic Segmentation
Svetlana Pavlitska, Enrico Eisen, J. Marius Zöllner
TL;DR
This work investigates the adversarial robustness of model-level mixtures of experts (MoEs) for semantic segmentation in urban driving scenes. It compares two expert architectures (urban and highway) with four MoE variants that differ in gating strategy and an optional post-weight convolution, under white-box per-instance and universal FGSM/PGD attacks and transfer settings. The findings show that MoEs generally yield smaller performance drops than ensembles, with the best robustness achieved by a classwise gate combined with an extra convolutional layer, especially in DeepLabv3+-based backbones. The results highlight the practical potential of MoEs to improve the reliability of segmentation in adversarial environments. Code is provided for replication.
Abstract
Vulnerability to adversarial attacks is a well-known deficiency of deep neural networks. Larger networks are generally more robust, and ensembling is one method to increase adversarial robustness: each model's weaknesses are compensated by the strengths of others. While an ensemble uses a deterministic rule to combine model outputs, a mixture of experts (MoE) includes an additional learnable gating component that predicts weights for the outputs of the expert models, thus determining their contributions to the final prediction. MoEs have been shown to outperform ensembles on specific tasks, but their susceptibility to adversarial attacks has not yet been studied. In this work, we evaluate the adversarial vulnerability of MoEs for semantic segmentation of urban and highway traffic scenes. We show that MoEs are, in most cases, more robust to per-instance and universal white-box adversarial attacks and can better withstand transfer attacks. Our code is available at https://github.com/KASTEL-MobilityLab/mixtures-of-experts/.
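The difference between an ensemble and an MoE described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the tensor shapes, and the use of fixed `gate_logits` in place of a trained gating network are assumptions made only for illustration. In a real MoE, the gate would predict these logits from the input.

```python
import numpy as np

def softmax(z, axis=0):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_mean(expert_probs):
    # Deterministic rule: average the expert outputs.
    # expert_probs has shape (experts, classes, height, width).
    return expert_probs.mean(axis=0)

def moe_combine(expert_probs, gate_logits):
    # Gate-weighted combination of expert outputs.
    # gate_logits has shape (experts,) for one weight per expert, or
    # (experts, classes) for one weight per expert and class.
    # Hypothetical stand-in for the output of a learned gate.
    weights = softmax(gate_logits, axis=0)  # normalize across experts
    # Append singleton axes so the weights broadcast over the remaining dims.
    extra_dims = expert_probs.ndim - weights.ndim
    weights = weights.reshape(weights.shape + (1,) * extra_dims)
    return (weights * expert_probs).sum(axis=0)

# Toy data: 2 experts (e.g. urban and highway), 3 classes, 4x4 image.
rng = np.random.default_rng(0)
expert_probs = softmax(rng.normal(size=(2, 3, 4, 4)), axis=1)

ensemble_out = ensemble_mean(expert_probs)
# A gate that strongly prefers expert 0 reproduces that expert's output.
moe_out = moe_combine(expert_probs, np.array([50.0, -50.0]))
# Per-class gate logits, loosely analogous to a classwise gate.
classwise_out = moe_combine(expert_probs, rng.normal(size=(2, 3)))

# Predicted label map: argmax over the class axis.
labels = moe_out.argmax(axis=0)
```

With equal gate logits, the MoE output reduces to the ensemble mean, so the ensemble can be seen as the special case of a gate that never changes its weights. Note that with per-class weights the combined scores no longer necessarily sum to one over classes, which does not affect the argmax.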
