Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach

Shaopeng Fu, Di Wang

TL;DR

This work introduces a neural tangent kernel (NTK) based framework to analyze adversarial training (AT) dynamics in wide deep neural networks. It proves that AT-trained wide networks are well-approximated by their linearizations $f^{\mathrm{lin}}_t$ and derives closed-form AT dynamics under squared loss, revealing a time-dependent regularization $\hat{\Xi}(t)$ that encodes robustness. A key finding is the AT degeneration phenomenon: as training time grows, $\hat{\Xi}(t)$ fades and AT converges to standard training, explaining robust overfitting and motivating early stopping. To mitigate this, the authors propose Adv-NTK, an AT algorithm for infinite-width networks that optimizes the diagonalized regularization and uses PGD with a validation set; experiments on CIFAR-10 and SVHN show Adv-NTK achieves robustness comparable to finite-width AT and, in some cases, better than AT. The results provide both theoretical insight and a practical tool for enhancing robustness in infinite-width DNNs.
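The summary mentions PGD (projected gradient descent) without spelling it out. The sketch below shows the generic $\ell_\infty$ PGD attack on a toy linear model with squared loss. It is not the Adv-NTK procedure itself; the function name `pgd_linf`, the step size, the radius `eps`, and the toy data are all illustrative assumptions.

```python
import numpy as np

def pgd_linf(x, y, w, eps=0.1, step=0.02, n_steps=20):
    """l_inf PGD against the loss 0.5 * (w @ x' - y)**2 of a linear model.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    x_adv = x.copy()
    for _ in range(n_steps):
        residual = w @ x_adv - y
        grad = residual * w                        # gradient of the loss w.r.t. the input
        x_adv = x_adv + step * np.sign(grad)       # signed gradient ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back into the eps-ball around x
    return x_adv

rng = np.random.default_rng(0)
w = rng.normal(size=8)   # toy model weights
x = rng.normal(size=8)   # toy clean input
y = 1.0                  # toy target

x_adv = pgd_linf(x, y, w)
clean_loss = 0.5 * (w @ x - y) ** 2
adv_loss = 0.5 * (w @ x_adv - y) ** 2
print(clean_loss, adv_loss)
```

For a linear model the attack saturates at a corner of the $\ell_\infty$ ball, so the adversarial loss is never below the clean loss. For a DNN the same loop would apply with the input gradient obtained by backpropagation.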

Abstract

Adversarial training (AT) is a canonical method for enhancing the robustness of deep neural networks (DNNs). However, recent studies have empirically demonstrated that it suffers from robust overfitting, i.e., prolonged AT can be detrimental to the robustness of DNNs. This paper presents a theoretical explanation of robust overfitting for DNNs. Specifically, we non-trivially extend the neural tangent kernel (NTK) theory to AT and prove that an adversarially trained wide DNN can be well approximated by a linearized DNN. Moreover, for squared loss, closed-form AT dynamics for the linearized DNN can be derived, which reveals a new AT degeneration phenomenon: long-term AT causes a wide DNN to degenerate to the one obtained without AT, and thus causes robust overfitting. Based on our theoretical results, we further design a method named Adv-NTK, the first AT algorithm for infinite-width DNNs. Experiments on real-world datasets show that Adv-NTK helps infinite-width DNNs achieve robustness comparable to that of their finite-width counterparts, which in turn justifies our theoretical findings. The code is available at https://github.com/fshp971/adv-ntk.
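To make "approximated by a linearized DNN" concrete, the sketch below builds the first-order Taylor expansion of a network around its initial parameters, $f^{\mathrm{lin}}(x;\theta) = f(x;\theta_0) + \partial_\theta f(x;\theta_0)\,(\theta - \theta_0)$, and measures its gap to the true network after a unit-norm parameter move. The two-layer tanh MLP, its $1/\sqrt{\text{fan-in}}$ scaling, and the random perturbation are assumptions chosen for illustration; the paper's setting (deep MLPs trained with AT) is more general.

```python
import numpy as np

def mlp(theta, x, width):
    """Two-layer tanh MLP with 1/sqrt(fan-in) scaling and scalar output."""
    d = x.shape[0]
    W = theta[: width * d].reshape(width, d)
    v = theta[width * d:]
    return v @ np.tanh(W @ x / np.sqrt(d)) / np.sqrt(width)

def grad_theta(theta, x, width):
    """Analytic gradient of the MLP output with respect to all parameters."""
    d = x.shape[0]
    W = theta[: width * d].reshape(width, d)
    v = theta[width * d:]
    h = np.tanh(W @ x / np.sqrt(d))
    dW = np.outer(v * (1.0 - h**2), x) / np.sqrt(d * width)
    dv = h / np.sqrt(width)
    return np.concatenate([dW.ravel(), dv])

def linearized(theta, theta0, x, width):
    """First-order Taylor expansion of the MLP around theta0."""
    return mlp(theta0, x, width) + grad_theta(theta0, x, width) @ (theta - theta0)

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=d)
gaps = {}
for width in (16, 256, 4096):
    theta0 = rng.normal(size=width * (d + 1))   # standard normal initialization
    delta = rng.normal(size=theta0.size)
    delta /= np.linalg.norm(delta)              # unit-norm parameter move
    theta = theta0 + delta
    gaps[width] = abs(mlp(theta, x, width) - linearized(theta, theta0, x, width))
    print(width, gaps[width])
```

In this toy setup the gap is a second-order term whose size is controlled by the Hessian of the network, which typically shrinks as the width grows. That is the intuition behind the approximation result, not a substitute for its proof.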

Paper Structure

This paper contains 41 sections, 29 theorems, 178 equations, 3 figures, 5 tables, 1 algorithm.

Key Result

Theorem 4.1

Suppose $f_0$ is an MLP defined and initialized as in Section sec:adv-train. Then, for any $x, x' \in \mathcal{X}$, the kernels at initialization converge to limits given by $\Theta^{\infty}_{\theta}(x, x')$ and $\Theta^{\infty}_{x}(x, x')$, where $\Theta^{\infty}_{\theta}: \mathcal{X} \times \mathcal{X} \rightarrow \mathbb{R}$ and $\Theta^{\infty}_{x}: \mathcal{X} \times \mathcal{X} \rightarrow \mathbb{R}$ are two deterministic kernel functions.
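As a rough illustration of the finite-width quantities behind such limits, the sketch below computes two Gram matrices for a randomly initialized two-layer MLP: one built from Jacobians with respect to the parameters $\theta$, and one from Jacobians with respect to the input $x$. Reading the subscripts $\theta$ and $x$ this way is an assumption, as are the architecture, the scaling, and all function names. The code only evaluates finite-width kernels and reports how much one parameter-kernel entry varies across random seeds; it does not verify the theorem.

```python
import numpy as np

def jacobians(W, v, x):
    """Jacobians of f(x) = v . tanh(W x / sqrt(d)) / sqrt(width) w.r.t. parameters and input."""
    width, d = W.shape
    h = np.tanh(W @ x / np.sqrt(d))
    back = v * (1.0 - h**2) / np.sqrt(d * width)   # per-unit sensitivity of the output
    d_theta = np.concatenate([np.outer(back, x).ravel(), h / np.sqrt(width)])
    d_x = W.T @ back
    return d_theta, d_x

def empirical_kernels(W, v, xs):
    """Gram matrices of parameter-Jacobians and input-Jacobians over the points xs."""
    pairs = [jacobians(W, v, x) for x in xs]
    J_theta = np.stack([p[0] for p in pairs])
    J_x = np.stack([p[1] for p in pairs])
    return J_theta @ J_theta.T, J_x @ J_x.T

def kernels_at_init(seed, width, xs):
    """Draw a standard normal initialization and return both kernels."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(width, xs.shape[1]))
    v = rng.normal(size=width)
    return empirical_kernels(W, v, xs)

xs = np.random.default_rng(123).normal(size=(3, 8))   # three toy inputs
spread = {}
for width in (64, 4096):
    entries = [kernels_at_init(seed, width, xs)[0][0, 1] for seed in range(10)]
    spread[width] = float(np.std(entries))             # variation across initializations
    print(width, spread[width])

K_theta, K_x = kernels_at_init(0, 4096, xs)
```

In this toy model the seed-to-seed spread of the parameter kernel shrinks as the width grows, which matches the familiar NTK picture of a deterministic limit at initialization. The input kernel is computed for comparison only; the sketch makes no claim about how it behaves in the limit.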

Figures (3)

  • Figure 1: Robust test accuracy curves of finite-width MLP-5/CNN-5 during AT on CIFAR-10. The robust test accuracies of infinite-width DNNs learned by NTK and Adv-NTK are also plotted.
  • Figure 2: Robust test accuracy curves of finite-width MLP-5/CNN-5 during AT on SVHN. The robust test accuracies of infinite-width DNNs learned by NTK and Adv-NTK are also plotted.
  • Figure 3: Robust test accuracy curves of ResNet models during AT. The robust test accuracies of infinite-width DNNs learned by NTK and Adv-NTK are also plotted.

Theorems & Definitions (66)

  • Theorem 4.1: Kernel limits at initialization; informal version of Theorem \ref{thm:conv-init-formal}
  • Remark 1
  • Theorem 4.2: Equivalence between wide DNN and linearized DNN
  • Remark 2
  • Theorem 4.3: Closed-form AT dynamics of $f^{\mathrm{lin}}_t$ under squared loss
  • Corollary 4.1
  • Proof
  • Remark 3
  • Definition A.1: Lipschitz continuity
  • Definition A.2: Lipschitz smoothness
  • ...and 56 more