Parameter Interpolation Adversarial Training for Robust Image Classification

Xin Liu; Yichen Yang; Kun He; John E. Hopcroft

TL;DR

The paper tackles robustness oscillations and overfitting in adversarial training by introducing Parameter Interpolation Adversarial Training (PIAT), which linearly interpolates model parameters between successive epochs with a dynamically increasing weight. It couples this with NMSE, a logit-normalization-based regularizer that aligns relative logit magnitudes between clean and adversarial examples, and combines them in a single loss L = L_CE + μ L_NMSE. Across CNNs and Vision Transformers on CIFAR-10/100 and SVHN, PIAT+NMSE consistently improves robust accuracy under strong attacks (PGD20, CW, AutoAttack) while preserving clean accuracy, outperforming several baselines. The approach is validated via toy experiments, theoretical insights (Theorem 3.1 on interpolation stability), extensive ablations, and loss-landscape analyses, highlighting its stability and generality. The method offers a practical, adaptable route to more reliable adversarial defenses without altering the core AT framework.
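As a rough illustration of the combined objective L = L_CE + μ L_NMSE, the sketch below computes it for a single example in plain Python. Several details are assumptions, not taken from the summary above: logits are normalized by their L2 norm, the cross-entropy term is applied to the adversarial logits, and the helper names (`l2_normalize`, `nmse`, `cross_entropy`, `total_loss`) are hypothetical.

```python
import math


def l2_normalize(logits, eps=1e-12):
    # Assumption: "logit normalization" means dividing by the L2 norm.
    norm = math.sqrt(sum(z * z for z in logits))
    return [z / (norm + eps) for z in logits]


def nmse(clean_logits, adv_logits):
    # Mean squared error between the *normalized* logit vectors, so only
    # relative magnitudes matter, not the overall scale of the logits.
    a = l2_normalize(clean_logits)
    b = l2_normalize(adv_logits)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cross_entropy(logits, label):
    # Numerically stable softmax cross-entropy for one example.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def total_loss(clean_logits, adv_logits, label, mu=1.0):
    # L = L_CE + mu * L_NMSE; applying CE to the adversarial logits is an
    # assumption made for this sketch.
    return cross_entropy(adv_logits, label) + mu * nmse(clean_logits, adv_logits)
```

Under this normalization, rescaling one logit vector by a positive constant leaves the NMSE term essentially unchanged, which is one way to read "aligning relative rather than absolute logit magnitudes".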

Abstract

Although deep neural networks achieve superior performance on various tasks, they remain vulnerable to adversarial examples. Adversarial training has been demonstrated to be the most effective method to defend against adversarial attacks. However, with existing adversarial training methods, model robustness exhibits apparent oscillations and overfitting during training, which degrades defense efficacy. To address these issues, we propose a novel framework called Parameter Interpolation Adversarial Training (PIAT). PIAT tunes the model parameters between epochs by interpolating the parameters of the previous and current epochs. This makes the model's decision boundary change more moderately and alleviates overfitting, helping the model converge better and achieve higher robustness. In addition, we suggest using the Normalized Mean Square Error (NMSE) to further improve robustness by aligning the relative magnitude, rather than the absolute magnitude, of the logits of clean and adversarial examples. Extensive experiments on several benchmark datasets demonstrate that our framework markedly improves the robustness of both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).
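The interpolation step can be pictured with a minimal sketch over flat parameter lists. The abstract does not specify the interpolation weight or its schedule, so the linear ramp in `weight_schedule`, the `max_weight` value, and the choice to place the weight on the previous epoch's parameters are all hypothetical choices made for illustration.

```python
def interpolate_params(prev_params, curr_params, weight):
    # Linear interpolation: weight * previous + (1 - weight) * current.
    # Assumption: `weight` is the share given to the previous epoch.
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(prev_params) != len(curr_params):
        raise ValueError("parameter lists must have the same length")
    return [weight * p + (1.0 - weight) * c
            for p, c in zip(prev_params, curr_params)]


def weight_schedule(epoch, total_epochs, max_weight=0.9):
    # Hypothetical schedule: the weight grows linearly from 0 to max_weight,
    # so later epochs move the parameters more cautiously.
    return max_weight * epoch / total_epochs


# Usage: after finishing an epoch, blend the new parameters with the old ones.
prev = [0.0, 2.0]
curr = [1.0, 4.0]
blended = interpolate_params(prev, curr, weight=0.5)
```

In a real training loop the same blend would be applied tensor by tensor to the model's parameters once per epoch, with the blended result carried forward as the starting point for the next epoch.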
