Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach

Shaopeng Fu; Di Wang

TL;DR

This work introduces a neural tangent kernel (NTK) based framework to analyze adversarial training (AT) dynamics in wide deep neural networks. It proves that AT-trained wide networks are well-approximated by their linearizations $f^{\mathrm{lin}}_t$ and derives closed-form AT dynamics under squared loss, revealing a time-dependent regularization $\hat{\Xi}(t)$ that encodes robustness. A key finding is the AT degeneration phenomenon: as training time grows, $\hat{\Xi}(t)$ fades and AT converges to standard training, explaining robust overfitting and motivating early stopping. To mitigate this, the authors propose Adv-NTK, an AT algorithm for infinite-width networks that optimizes the diagonalized regularization and uses PGD with a validation set; experiments on CIFAR-10 and SVHN show Adv-NTK achieves robustness comparable to finite-width AT and, in some cases, better than AT. The results provide both theoretical insight and a practical tool for enhancing robustness in infinite-width DNNs.
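The PGD attack mentioned above can be illustrated on a toy problem. The following is a minimal sketch, assuming a linear model $f(x) = w^\top x$ with squared loss and an $\ell_\infty$ perturbation budget; the function name `pgd_linf` and the values of `eps`, `step`, and `iters` are illustrative choices, not the paper's implementation or settings.

```python
import numpy as np

def pgd_linf(x, y, w, eps=0.1, step=0.02, iters=10):
    """L-infinity PGD against a toy linear model f(x) = w @ x with squared loss.

    Hypothetical helper for illustration only; not taken from the Adv-NTK code.
    """
    x_adv = x.copy()
    for _ in range(iters):
        residual = w @ x_adv - y                   # f(x_adv) - y
        grad = residual * w                        # gradient of 0.5 * (w @ x - y)^2 w.r.t. x
        x_adv = x_adv + step * np.sign(grad)       # signed ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back into the eps-ball around x
    return x_adv

rng = np.random.default_rng(0)
w = rng.normal(size=5)
x = rng.normal(size=5)
y = 1.0

x_adv = pgd_linf(x, y, w)
clean_loss = 0.5 * (w @ x - y) ** 2
adv_loss = 0.5 * (w @ x_adv - y) ** 2
print(clean_loss, adv_loss)
```

For this linear model every ascent step moves the prediction further from the label, so the adversarial loss is never below the clean loss while the perturbation stays inside the `eps`-ball.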

Abstract

Adversarial training (AT) is a canonical method for enhancing the robustness of deep neural networks (DNNs). However, recent studies have empirically demonstrated that it suffers from robust overfitting, i.e., prolonged AT can be detrimental to the robustness of DNNs. This paper presents a theoretical explanation of robust overfitting for DNNs. Specifically, we non-trivially extend the neural tangent kernel (NTK) theory to AT and prove that an adversarially trained wide DNN can be well approximated by a linearized DNN. Moreover, for squared loss, closed-form AT dynamics for the linearized DNN can be derived, which reveal a new AT degeneration phenomenon: long-term AT causes a wide DNN to degenerate to the one obtained without AT, and thus causes robust overfitting. Based on our theoretical results, we further design a method named Adv-NTK, the first AT algorithm for infinite-width DNNs. Experiments on real-world datasets show that Adv-NTK can help infinite-width DNNs achieve robustness comparable to that of their finite-width counterparts, which in turn justifies our theoretical findings. The code is available at https://github.com/fshp971/adv-ntk.
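For orientation, the standard-training baseline that AT is said to degenerate to has a well-known closed form under squared loss. With zero initial output, kernel gradient flow gives $f_t(x) = K(x, X) K^{-1} (I - e^{-\eta K t}) y$. The sketch below evaluates this expression only; it is not the paper's AT dynamics and contains no regularization term. The RBF kernel is an assumed stand-in for an NTK, and `rbf_kernel`, `kernel_flow_predict`, and the toy data are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix; an assumed stand-in for an NTK in this sketch."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dist)

def kernel_flow_predict(k_test_train, k_train, y, t, lr=1.0):
    """Prediction at time t of kernel gradient flow under squared loss, with f_0 = 0.

    Evaluates K(x, X) K^{-1} (I - exp(-lr * K * t)) y via an eigendecomposition of K.
    """
    lam, vec = np.linalg.eigh(k_train)
    coeff = -np.expm1(-lr * lam * t) / lam   # (1 - exp(-lr * lam * t)) / lam per eigenvalue
    return k_test_train @ (vec @ (coeff * (vec.T @ y)))

# Toy 1-D regression data (illustrative only).
X = np.linspace(-2.0, 2.0, 6)[:, None]
y = np.sin(X[:, 0])
K = rbf_kernel(X, X)

for t in [0.1, 1.0, 10.0, 1e4]:
    pred = kernel_flow_predict(K, K, y, t)
    print(t, np.linalg.norm(pred - y))   # training residual shrinks as t grows
```

As `t` grows the predictor approaches plain kernel regression $K(x, X) K^{-1} y$, the standard-training solution.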

Paper Structure (41 sections, 29 theorems, 178 equations, 3 figures, 5 tables, 1 algorithm)

Introduction
Related Works
Preliminaries
Adversarial Training of Wide DNNs
Gradient Flow-based Adversarial Example Search
Adversarial Training Dynamics
Adversarial Training in Infinite-Width
Robust Overfitting in Wide DNNs
AT Degeneration Leads to Robust Overfitting
Infinite Width Adversarial Training
Empirical Analysis of Adv-NTK
Conclusions
Preliminaries
Additional Assumptions and Notations
Notations
...and 26 more sections

Key Result

Theorem 4.1

Suppose $f_0$ is an MLP defined and initialized as in Section \ref{sec:adv-train}. Then, for any $x, x' \in \mathcal{X}$, we have [equation missing from the extracted text], where $\Theta^{\infty}_{\theta}: \mathcal{X} \times \mathcal{X} \rightarrow \mathbb{R}$ and $\Theta^{\infty}_{x}: \mathcal{X} \times \mathcal{X} \rightarrow \mathbb{R}$ are two deterministic kernel functions.

Figures (3)

Figure 1: The robust test accuracy curves of finite-width MLP-5/CNN-5 along AT on CIFAR-10. The robust test accuracies of infinite-width DNNs learned by NTK and Adv-NTK are also plotted.
Figure 2: The robust test accuracy curves of finite-width MLP-5/CNN-5 along AT on SVHN. The robust test accuracies of infinite-width DNNs learned by NTK and Adv-NTK are also plotted.
Figure 3: The robust test accuracy curves of ResNet models along AT. The robust test accuracies of infinite-width DNNs learned by NTK and Adv-NTK are also plotted.

Theorems & Definitions (66)

Theorem 4.1: Kernel limits at initialization; informal version of Theorem \ref{thm:conv-init-formal}
Remark 1
Theorem 4.2: Equivalence between wide DNN and linearized DNN
Remark 2
Theorem 4.3: Closed-form AT dynamics of $f^{\mathrm{lin}}_t$ under squared loss
Corollary 4.1
proof
Remark 3
Definition A.1: Lipschitz continuity
Definition A.2: Lipschitz smoothness
...and 56 more
