Adversarial generalization of unfolding (model-based) networks

Vicky Kouni

TL;DR

This work tackles adversarial generalization of model-based unfolding networks for compressed sensing by coupling a Lipschitz analysis of FGSM-attacked final decoders with adversarial Rademacher complexity (ARC) to derive generalization bounds that depend on overparameterization $N$, depth $L$, and attack level $\varepsilon$. Using ADMM-DAD with an overcomplete sparsifier $W$, the authors prove Lipschitz continuity of the perturbed decoder in $W$ and obtain a tight ARC-based bound that scales with $\sqrt{NL\log(\varepsilon)}/\sqrt{s}$ up to constants, plus a term for failure probability. The theory is corroborated by experiments on CIFAR-10 and SVHN, showing that increasing the overcomplete representation improves robustness while the empirical adversarial generalization error tracks the predicted scaling. The results provide practical design guidance for robust, interpretable unfolding networks in safety-critical inverse problems such as CS-MRI, linking architectural overparameterization to adversarial resilience.

Abstract

Unfolding networks are interpretable networks that emerge from iterative algorithms, incorporate prior knowledge of data structure, and are designed to solve inverse problems such as compressed sensing, which deals with recovering data from noisy, missing observations. Compressed sensing finds applications in critical domains, from medical imaging to cryptography, where adversarial robustness is crucial to prevent catastrophic failures. However, a solid theoretical understanding of the performance of unfolding networks in the presence of adversarial attacks is still in its infancy. In this paper, we study the adversarial generalization of unfolding networks when perturbed with $\ell_2$-norm constrained attacks, generated by the fast gradient sign method. In particular, we choose a family of state-of-the-art overparameterized unfolding networks and deploy a new framework to estimate their adversarial Rademacher complexity. Given this estimate, we provide adversarial generalization error bounds for the networks under study, which are tight with respect to the attack level. To our knowledge, this is the first theoretical analysis of the adversarial generalization of unfolding networks. We further present a series of experiments on real-world data, with results that corroborate our derived theory consistently across all datasets. Finally, we observe that the family's overparameterization can be exploited to promote adversarial robustness, shedding light on how to efficiently robustify neural networks.
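To make the attack model concrete, the following is a minimal sketch of a single-step, $\ell_2$-constrained gradient attack on the measurements of a compressed sensing problem. It is not the paper's ADMM-DAD decoder: the pseudoinverse decoder `B`, the Gaussian measurement matrix `A`, the problem sizes, and the helper names are all hypothetical stand-ins chosen for illustration, and the normalized-gradient step is one common $\ell_2$ variant of FGSM rather than a construction confirmed by this excerpt.

```python
import numpy as np

def l2_fgsm_perturbation(grad, eps):
    """Scale a loss gradient so the perturbation has l2-norm eps (single step)."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        # No ascent direction available; return a zero perturbation.
        return np.zeros_like(grad)
    return eps * grad / norm

def mse_loss(decoder, y, x):
    """Mean squared reconstruction error of a linear decoder applied to y."""
    return float(np.mean((decoder @ y - x) ** 2))

def mse_grad_wrt_input(decoder, y, x):
    """Gradient of mse_loss with respect to the measurements y."""
    residual = decoder @ y - x
    return 2.0 * decoder.T @ residual / x.size

# Hypothetical toy problem: n-dimensional signal, m < n noisy measurements.
rng = np.random.default_rng(0)
n, m = 16, 8
A = rng.standard_normal((m, n)) / np.sqrt(m)   # assumed measurement matrix
x = rng.standard_normal(n)                     # ground-truth signal
y = A @ x + 0.01 * rng.standard_normal(m)      # noisy measurements

B = np.linalg.pinv(A)   # stand-in linear decoder (assumption, not ADMM-DAD)
eps = 0.1               # attack level

delta = l2_fgsm_perturbation(mse_grad_wrt_input(B, y, x), eps)
clean = mse_loss(B, y, x)
adv = mse_loss(B, y + delta, x)
print(f"||delta||_2 = {np.linalg.norm(delta):.4f}")
print(f"clean MSE = {clean:.6f}, adversarial MSE = {adv:.6f}")
```

Because the toy loss is convex in `y`, a step along the gradient cannot decrease it, so the adversarial MSE is at least the clean MSE here. For a learned unfolding network the gradient would typically come from automatic differentiation instead of a closed form.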

Paper Structure

This paper contains 24 sections, 14 theorems, 76 equations, 4 figures, 2 tables.

Key Result

Theorem 3

For $L\geq2$ being the total number of layers, let $\widetilde{\mathcal{H}}^L$ be the adversarial hypothesis class defined in (advhypo1) and $\delta$ an adversarial attack generated by the FGSM, with $\|\delta\|_2\leq\varepsilon$, for attack level $\varepsilon>0$. Assume there exist pair-samples $\{(x_i,$ … with $\mathrm{Lip}_h^{L,\varepsilon}$, defined in Theorem 6, being the Lipschitz constant …

Figures (4)

  • Figure 1: Performance of ADMM-DAD (plotted on logarithmic scale) on CIFAR10 (left) and SVHN (right), for varying number of layers $L$ and attack levels $\varepsilon$ of the FGSM, and overcompleteness $N=10n$. Top: clean test MSE and adversarial test MSE. Bottom: adversarial EGE. For both datasets, the adversarial EGE increases as Theorem 3 suggests, and in fact scales at the rate dictated by Corollary 4, thus confirming our derived generalization theory. A similar increase is observed for both the clean and the adversarial test MSE, but at a reasonable rate, thereby highlighting the adversarial robustness of ADMM-DAD.
  • Figure 2: Performance of ADMM-DAD (plotted on logarithmic scale) on CIFAR10 (left) and SVHN (right), for varying number of layers $L$ and attack levels $\varepsilon$ of the PGD (10 iterations), and overcompleteness $N=10n$. Top: clean test MSE and adversarial test MSE. Bottom: adversarial EGE. Although our theoretical analysis focuses on FGSM, we observe that even for a stronger adversarial attack like PGD, the adversarial EGE scales at the rate dictated by Corollary 4, for both datasets, thus corroborating our derived generalization theory. Similarly, the clean and the adversarial test MSE also increase, but at a reasonable rate, thus highlighting the adversarial robustness of ADMM-DAD, even under attacks more powerful than FGSM.
  • Figure 3: Robustness plots of 5-layer ADMM-DAD on CIFAR10 (left) and 10-layer ADMM-DAD on SVHN (right), for varying overcompleteness $N$ and different attack levels $\varepsilon$ of FGSM. For both datasets, as $N$ increases, the clean test MSEs and the adversarial test MSEs drop. Importantly, for the standard case of $\varepsilon=0.01$, the robustness gap of ADMM-DAD on both datasets is particularly small. All in all, the results highlight the beneficial role that $N$ plays in robustifying ADMM-DAD against varying adversarial attack levels.
  • Figure 4: Adversarial generalization of ADMM-DAD measured in terms of the adversarial empirical generalization error (EGE), for varying overcompleteness $N$ and different attack levels $\varepsilon$. For both datasets, the adversarial EGE increases as $N$ increases, as Theorem 3 suggests, thus confirming our derived adversarial generalization theory.
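
The quantities reported in the figures can be sketched as follows. The paper's exact definitions are not reproduced in this excerpt, so the formulas below are assumptions: the MSE is taken as a plain mean over all entries, the adversarial EGE as the absolute difference between adversarial test and adversarial train MSE, and the robustness gap as adversarial test MSE minus clean test MSE. All function names and numbers are hypothetical.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error over all samples and coordinates (assumed definition)."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

def adversarial_ege(adv_train_mse, adv_test_mse):
    """Assumed adversarial empirical generalization error: |test - train|."""
    return abs(adv_test_mse - adv_train_mse)

def robustness_gap(clean_test_mse, adv_test_mse):
    """Assumed robustness gap: how much the attack raises the test MSE."""
    return adv_test_mse - clean_test_mse

# Toy check of the MSE helper on two 2-dimensional samples.
targets = [[1.0, 2.0], [3.0, 4.0]]
preds = [[1.0, 2.0], [3.0, 6.0]]
toy_mse = mse(preds, targets)   # squared errors 0, 0, 0, 4

# Hypothetical metric values, for illustration only (not results from the paper).
ege = adversarial_ege(adv_train_mse=0.010, adv_test_mse=0.013)
gap = robustness_gap(clean_test_mse=0.012, adv_test_mse=0.013)
print(toy_mse, ege, gap)
```

Under these assumed definitions, a small robustness gap at a given $\varepsilon$ corresponds to the clean and adversarial test curves lying close together in the top panels.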

Theorems & Definitions (24)

  • Definition 1: Parameter class of ADMM-DAD
  • Remark 2
  • Theorem 3: Adversarial generalization error bounds for ADMM-DAD
  • Corollary 4: Growth rate
  • Proposition 5: Bounded outputs
  • Theorem 6: Lipschitz continuity of the perturbed decoder w.r.t. parameter
  • Proposition 7: Upper-bound on covering numbers
  • Theorem 8: ARC estimate
  • Theorem B.1
  • proof
  • ...and 14 more