Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach

Xu Zhang; Kaidi Xu; Ziqing Hu; Ren Wang

TL;DR

This work addresses adversarial robustness in Mixture of Experts (MoE) by revealing that expert networks are more vulnerable than routers and introducing a targeted robust training method (RT-ER) that strengthens the second-top expert via a KL-based regularization term. Building on this, the authors propose a dual-model framework that blends a standard MoE and a robust MoE with a smoothing parameter $\alpha$, and derive certified robustness bounds for both the single MoE and the dual-model. To maximize both robustness and accuracy, they introduce JTDMoE, a bi-level joint training strategy that aligns the standard and robust MoEs. Empirical results on CIFAR-10 and TinyImageNet using ResNet18 and ViT-small demonstrate substantial improvements in robust accuracy with limited loss in standard accuracy, and the code is publicly available.

Abstract

Mixture of Experts (MoE) models have shown remarkable success in leveraging specialized expert networks for complex machine learning tasks. However, their susceptibility to adversarial attacks presents a critical challenge for deployment in applications that require robustness. This paper addresses the question of how to incorporate robustness into MoEs while maintaining high natural accuracy. We begin by analyzing the vulnerability of MoE components, finding that expert networks are notably more susceptible to adversarial attacks than the router. Based on this insight, we propose a targeted robust training technique that integrates a novel loss function to enhance the adversarial robustness of MoE, requiring only the robustification of one additional expert without compromising training or inference efficiency. Building on this, we introduce a dual-model strategy that linearly combines a standard MoE model with our robustified MoE model using a smoothing parameter, which allows flexible control over the robustness-accuracy trade-off. We further provide theoretical foundations by deriving certified robustness bounds for both the single MoE and the dual-model. To push the boundaries of robustness and accuracy, we propose a novel joint training strategy, JTDMoE, for the dual-model. This joint training enhances both robustness and accuracy beyond what is achievable with separate models. Experimental results on the CIFAR-10 and TinyImageNet datasets using ResNet18 and Vision Transformer (ViT) architectures demonstrate the effectiveness of our proposed methods. The code is publicly available at https://github.com/TIML-Group/Robust-MoE-Dual-Model.
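
To make the dual-model idea concrete, the following minimal sketch shows one way a linear combination with a smoothing parameter could look. It is an illustration under stated assumptions, not the authors' implementation: the function names `softmax` and `blend_predictions` are hypothetical, the blend is assumed to act on class probabilities (the paper may combine outputs differently), and the convention that `alpha` weights the robust model is also an assumption. The logits are made-up numbers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def blend_predictions(std_logits, robust_logits, alpha):
    """Linearly combine two models' class probabilities.

    Assumption (not from the source): alpha in [0, 1] weights the robust
    model, so alpha=0 recovers the standard model's prediction and
    alpha=1 recovers the robust model's prediction.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(std_logits) != len(robust_logits):
        raise ValueError("both models must score the same classes")
    p_std = softmax(std_logits)
    p_rob = softmax(robust_logits)
    return [(1.0 - alpha) * ps + alpha * pr for ps, pr in zip(p_std, p_rob)]


# Made-up logits for a 3-class problem where the two models disagree.
std_logits = [4.0, 1.0, 0.5]      # hypothetical standard MoE output
robust_logits = [1.0, 1.5, 0.5]   # hypothetical robust MoE output

for alpha in (0.0, 0.5, 1.0):
    probs = blend_predictions(std_logits, robust_logits, alpha)
    predicted = probs.index(max(probs))
    print(f"alpha={alpha:.1f} predicted_class={predicted} "
          f"probs={[round(p, 3) for p in probs]}")
```

Because the blend is a convex combination of two probability vectors, the result is itself a valid probability vector, and sweeping `alpha` from 0 to 1 moves the prediction from the standard model's behavior toward the robust model's behavior.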
