Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach
Shaopeng Fu, Di Wang
TL;DR
This work introduces a neural tangent kernel (NTK) based framework for analyzing the dynamics of adversarial training (AT) in wide deep neural networks. It proves that AT-trained wide networks are well approximated by their linearizations $f^{\mathrm{lin}}_t$ and derives closed-form AT dynamics under squared loss, revealing a time-dependent regularization $\hat{\Xi}(t)$ that encodes robustness. A key finding is the AT degeneration phenomenon: as training time grows, $\hat{\Xi}(t)$ fades and AT converges to standard training, which explains robust overfitting and motivates early stopping. To mitigate this, the authors propose Adv-NTK, an AT algorithm for infinite-width networks that optimizes the diagonalized regularization and uses PGD with a validation set. Experiments on CIFAR-10 and SVHN show that Adv-NTK achieves robustness comparable to, and in some cases better than, that of finite-width networks trained with AT. The results provide both theoretical insight and a practical tool for enhancing robustness in infinite-width DNNs.
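To make the notion of a linearization concrete, the following is a minimal sketch of a first-order Taylor expansion of a small network around its initial parameters, together with one entry of the corresponding empirical NTK. It illustrates the general NTK idea only and is not the authors' implementation: the two-layer tanh architecture, the $1/\sqrt{\text{width}}$ scaling, and every function name (`mlp`, `mlp_grad`, `linearized`, `empirical_ntk`) are assumptions chosen for illustration.

```python
import math
import random

def hidden_units(W1, x):
    """Hidden activations tanh(W1 x) of the toy network."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]

def mlp(params, x):
    """Toy two-layer tanh network with scalar output (illustrative assumption)."""
    W1, w2 = params
    hidden = hidden_units(W1, x)
    return sum(a * h for a, h in zip(w2, hidden)) / math.sqrt(len(w2))

def mlp_grad(params, x):
    """Analytic gradient of the output with respect to (W1, w2)."""
    W1, w2 = params
    scale = 1.0 / math.sqrt(len(w2))
    hidden = hidden_units(W1, x)
    g_W1 = [[scale * a * (1.0 - h * h) * xi for xi in x] for a, h in zip(w2, hidden)]
    g_w2 = [scale * h for h in hidden]
    return g_W1, g_w2

def flatten(params):
    """Concatenate (W1, w2) into one flat list of parameters."""
    W1, w2 = params
    return [v for row in W1 for v in row] + list(w2)

def linearized(params0, params, x):
    """f_lin(x; theta) = f(x; theta0) + <grad f(x; theta0), theta - theta0>."""
    grad = flatten(mlp_grad(params0, x))
    delta = [p - p0 for p, p0 in zip(flatten(params), flatten(params0))]
    return mlp(params0, x) + sum(g * d for g, d in zip(grad, delta))

def empirical_ntk(params0, x1, x2):
    """One empirical NTK entry: inner product of parameter gradients at theta0."""
    g1 = flatten(mlp_grad(params0, x1))
    g2 = flatten(mlp_grad(params0, x2))
    return sum(a * b for a, b in zip(g1, g2))

rng = random.Random(0)
width, dim = 64, 3
theta0 = ([[rng.gauss(0, 1) for _ in range(dim)] for _ in range(width)],
          [rng.gauss(0, 1) for _ in range(width)])
# A nearby parameter setting, standing in for parameters after a little training.
theta = ([[w + 1e-3 * rng.gauss(0, 1) for w in row] for row in theta0[0]],
         [a + 1e-3 * rng.gauss(0, 1) for a in theta0[1]])

x = [0.5, -1.0, 2.0]
gap = abs(mlp(theta, x) - linearized(theta0, theta, x))
print("network vs. linearization gap:", gap)
print("empirical NTK entry K(x, x):", empirical_ntk(theta0, x, x))
```

The gap printed above is a second-order quantity in the parameter displacement, which is why the linearization is accurate while the parameters stay close to initialization. The paper's contribution is to show that this closeness still holds for wide networks under AT, which this toy example does not attempt to demonstrate.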
Abstract
Adversarial training (AT) is a canonical method for enhancing the robustness of deep neural networks (DNNs). However, recent studies have empirically demonstrated that it suffers from robust overfitting, i.e., prolonged AT can be detrimental to the robustness of DNNs. This paper presents a theoretical explanation of robust overfitting for DNNs. Specifically, we non-trivially extend the neural tangent kernel (NTK) theory to AT and prove that an adversarially trained wide DNN can be well approximated by a linearized DNN. Moreover, for squared loss, closed-form AT dynamics for the linearized DNN can be derived, which reveal a new AT degeneration phenomenon: long-term AT causes a wide DNN to degenerate to the one obtained without AT, and thus causes robust overfitting. Based on our theoretical results, we further design a method, namely Adv-NTK, the first AT algorithm for infinite-width DNNs. Experiments on real-world datasets show that Adv-NTK can help infinite-width DNNs achieve robustness comparable to that of their finite-width counterparts, which in turn justifies our theoretical findings. The code is available at https://github.com/fshp971/adv-ntk.
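For context, the degenerate limit "the one obtained without AT" can be written down using the well-known closed form for standard (non-adversarial) training of a linearized network under squared loss with gradient flow. The block below states that standard NTK result only. It is not the paper's AT dynamics, which additionally involve the regularization term, and the symbols $\eta$ (learning rate), $\hat{\Theta}$ (empirical NTK at initialization), $X$ (training inputs), and $Y$ (training targets) are notational assumptions introduced here.

```latex
% Standard training of the linearized network under the loss
% L(\theta) = \tfrac{1}{2} \| f^{\mathrm{lin}}_t(X) - Y \|_2^2 with gradient flow.
% On the training set, the residual decays exponentially:
\[
  f^{\mathrm{lin}}_t(X) - Y
  = e^{-\eta \hat{\Theta}(X, X)\, t} \bigl( f_0(X) - Y \bigr).
\]
% At an arbitrary input x (assuming \hat{\Theta}(X, X) is invertible):
\[
  f^{\mathrm{lin}}_t(x)
  = f_0(x) - \hat{\Theta}(x, X)\, \hat{\Theta}(X, X)^{-1}
    \bigl( I - e^{-\eta \hat{\Theta}(X, X)\, t} \bigr)
    \bigl( f_0(X) - Y \bigr).
\]
```

As $t \to \infty$ the matrix exponential vanishes when $\hat{\Theta}(X, X)$ is positive definite, so the predictor interpolates the training targets regardless of any robustness consideration, which is the behavior that long-term AT is said to degenerate to.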
