Tight Robustness Certificates and Wasserstein Distributional Attacks for Deep Neural Networks

Bach C. Le, Tung V. Dao, Binh T. Nguyen, Hong T. M. Chu

TL;DR

This paper advances robustness evaluation and certification for deep neural networks by leveraging the local geometry of network activations. It derives tractable, tight WDRO bounds for ReLU and smooth-activation networks through activation-cell masks and Jacobian analysis, providing exact or near-exact local Lipschitz certificates. It then introduces the Wasserstein Distributional Attack (WDA), which constructs distributional adversaries on 2N points by perturbing along margin-aligned directions with a tunable parameter $\kappa$, yielding stronger attacks than point-wise methods and closely matched ensemble baselines. Empirical results on CIFAR-10/100 show that WDA can tighten robustness certificates and reveal greater vulnerability than traditional evaluations, underscoring the value of a distributional perspective in robustness research; the authors also release code for replication.
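As a rough illustration of the 2N-point idea, the sketch below splits each sample's mass between the clean point and a point pushed $\kappa$ times farther along a given perturbation. The weight rule $\kappa^{-p}$, the Euclidean ground metric, and the names `two_point_attack` and `transport_cost` are assumptions made for this example, not the authors' construction. They are chosen so that the cost of the obvious coupling stays at the point-wise budget for every $\kappa\geq 1$, and so that $\kappa=1$ collapses to a point-wise attack.

```python
def two_point_attack(xs, deltas, kappa, p=1):
    """Build a 2N-atom distribution from N samples (illustrative assumption).

    xs, deltas: lists of equal-length float lists (samples and their
    point-wise perturbations). Returns (support, weights).
    """
    if kappa < 1:
        raise ValueError("kappa must be >= 1")
    n = len(xs)
    moved = kappa ** (-p)  # fraction of each sample's mass that is moved
    support, weights = [], []
    for x, d in zip(xs, deltas):
        # Atom 1: the clean point keeps the remaining mass.
        support.append(list(x))
        weights.append((1.0 - moved) / n)
        # Atom 2: a point kappa times farther along the perturbation.
        support.append([xi + kappa * di for xi, di in zip(x, d)])
        weights.append(moved / n)
    return support, weights


def transport_cost(deltas, kappa, p=1):
    """Cost of the obvious coupling: an upper bound on W_p^p (Euclidean metric)."""
    n = len(deltas)
    moved = kappa ** (-p)
    total = 0.0
    for d in deltas:
        dist = kappa * sum(di * di for di in d) ** 0.5
        total += (moved / n) * dist ** p
    return total


# Hypothetical toy data: two samples, each perturbed by a vector of norm 0.5.
xs = [[0.0, 0.0], [1.0, 1.0]]
deltas = [[0.3, 0.4], [0.0, 0.5]]
support, weights = two_point_attack(xs, deltas, kappa=2.0, p=1)
print(len(support), sum(weights), transport_cost(deltas, kappa=2.0, p=1))
```

Under these assumptions the coupling cost equals the mean of $\|\delta_i\|^p$ regardless of $\kappa$, so a larger $\kappa$ moves less mass over a longer distance without leaving the Wasserstein ball.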

Abstract

Wasserstein distributionally robust optimization (WDRO) provides a framework for adversarial robustness, yet existing methods based on global Lipschitz continuity or strong duality often yield loose upper bounds or require prohibitive computation. In this work, we address these limitations by introducing a primal approach and adopting a notion of exact Lipschitz certificate to tighten the WDRO upper bound. In addition, we propose a novel Wasserstein distributional attack (WDA) that directly constructs a candidate for the worst-case distribution. Compared with existing point-wise attacks and their variants, WDA offers greater flexibility in the number and location of attack points. In particular, by leveraging the piecewise-affine structure of ReLU networks on their activation cells, our approach yields an exact, tractable characterization of the corresponding WDRO problem. Extensive evaluations demonstrate that our method achieves competitive robust accuracy against state-of-the-art baselines while offering tighter certificates than existing methods. Our code is available at https://github.com/OLab-Repo/WDA

Paper Structure

This paper contains 30 sections, 4 theorems, 44 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Theorem 3.1

Given a ReLU network $\theta(x) = W_{H+1}( \operatorname{ReLU}(\cdots (W_1 x + b_1)\cdots )+b_{H})$ in general position, exponents $r, s$ with $1/r+1/s=1$, and $\ell$ the cross-entropy or DLR loss, define [...] and [...], where $J_{\bm{D}}$, $\mathcal{C}_{\bm{D}}$, and $\mathcal{D}_{\mathcal{X}}$ are defined in Definition 3.1 and $\mathrm{rec}(\mathcal{C}_{\bm{D}})$ is the recession cone of $\mathcal{C}_{\bm{D}}$.
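The objects $J_{\bm{D}}$ and $\mathcal{C}_{\bm{D}}$ rest on a standard fact: on the set of inputs sharing one activation pattern, a ReLU network is affine, with Jacobian $W_{H+1} D_H W_H \cdots D_1 W_1$ for 0/1 diagonal masks $D_h$ (presumably what $J_{\bm{D}}$ denotes). The pure-Python sketch below reads off the mask at an input and builds that matrix. The toy weights, the bias-free output layer, and the helper names `forward_with_mask` and `cell_jacobian` are hypothetical; the code is not taken from the authors' repository.

```python
def matvec(W, x):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Matrix-matrix product for list-of-rows matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def forward_with_mask(weights, biases, x):
    """Return (logits, mask). weights[:-1] are hidden layers, weights[-1] is a linear head."""
    mask, h = [], list(x)
    for W, b in zip(weights[:-1], biases):
        pre = [z + bi for z, bi in zip(matvec(W, h), b)]
        d = [1.0 if z > 0 else 0.0 for z in pre]  # activation pattern of this layer
        mask.append(d)
        h = [di * z for di, z in zip(d, pre)]     # ReLU written as mask * pre-activation
    return matvec(weights[-1], h), mask


def cell_jacobian(weights, mask):
    """Jacobian on the cell with pattern `mask`: W_{H+1} D_H W_H ... D_1 W_1."""
    J = weights[0]
    for d, W_next in zip(mask, weights[1:]):
        J = [[di * v for v in row] for di, row in zip(d, J)]  # left-multiply by diag(d)
        J = matmul(W_next, J)
    return J


# Hypothetical one-hidden-layer network with 2 inputs, 3 hidden units, 2 logits.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]
b1 = [0.0, -1.0, 0.5]
W2 = [[1.0, 2.0, -1.0], [0.0, -1.0, 3.0]]
x = [1.0, 0.5]
logits, mask = forward_with_mask([W1, W2], [b1], x)
J = cell_jacobian([W1, W2], mask)

# Inside the cell the network is affine, so a small step changes the logits by J @ step.
step = [0.01, -0.02]
moved, moved_mask = forward_with_mask([W1, W2], [b1], [a + s for a, s in zip(x, step)])
print(mask == moved_mask, [m - l for m, l in zip(moved, logits)], matvec(J, step))
```

Any norm of this matrix gives a Lipschitz constant that is valid only within the cell, which is why certificates built from per-cell Jacobians can be tighter than those built from a global product of layer norms.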

Figures (8)

  • Figure 1: Left: Wasserstein ambiguity ball $\Omega_{p} = \left\{ \mathbb{P} \colon \mathcal{W}_{d,p}(\mathbb{P},\mathbb{P}_N) \leq \epsilon \right\}$ inclusion and its admissible attacks. Our proposed Wasserstein Distributional Attack (WDA) with $\kappa\geq1$ includes its special case $\kappa=1$ as a point-wise attack, and produces a distributional attack when $\kappa>1$. Note that most of the existing tight certificates estimated an upper bound of WDRO w.r.t. $\Omega_{p=1}$, not $\Omega_{p=\infty}$. Right: Visualization of point-wise attack ($N$ adversarial samples) versus our WDA ($2N$ adversarial samples). Our WDA allows not only a larger number of supports but also a wider range of perturbations.
  • Figure 2: WDRO bounds and PGD attack loss for a fixed $n=K=2$ ReLU classifier with one hidden layer of dimension 8. Lower-bound curves are the cumulative $\bm{l}$ as more reachable activation masks are considered.
  • Figure 3: Wasserstein Distributional Attack (WDA, Alg. 1) for $r=2$. At each iterate $x_t$, WDA forms $K\!-\!1$ candidates $\varphi_j$ and updates using the one with the largest logit $\theta_j(\varphi_j)$. For reference, PGD follows the dual-norm gradient direction; DeepFool linearizes the decision boundary.
  • Figure 4: Robust accuracy with varying $\kappa$ on different defense methods.
  • Figure 5: Convergence of Wang et al. (2023) [wang2023better] under $\ell_{2}$ perturbations ($\epsilon = 0.5$).
  • ...and 3 more figures
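
The selection rule in the Figure 3 caption (form $K-1$ candidates $\varphi_j$, keep the one with the largest logit $\theta_j(\varphi_j)$) can be sketched for a linear classifier as follows. The candidate formula is not given in this summary, so the code assumes each $\varphi_j$ is an $\ell_2$-normalized step along the gradient of the margin $\theta_j - \theta_y$. That choice, the step size, and the function names `logits` and `select_candidate` are illustrative assumptions rather than the authors' Algorithm 1.

```python
def logits(W, b, x):
    """Logits of a linear classifier theta(x) = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def select_candidate(W, b, x, y, step):
    """Try one candidate per wrong class j and keep the one with the largest theta_j.

    Returns (score, j, phi), or None if no candidate could be formed.
    """
    best = None
    for j in range(len(W)):
        if j == y:
            continue
        # For a linear model the gradient of theta_j - theta_y is W[j] - W[y].
        g = [wj - wy for wj, wy in zip(W[j], W[y])]
        norm = sum(v * v for v in g) ** 0.5
        if norm == 0.0:
            continue
        # Candidate: move `step` in l2 norm along the margin direction (assumed form).
        phi = [xi + step * v / norm for xi, v in zip(x, g)]
        score = logits(W, b, phi)[j]
        if best is None or score > best[0]:
            best = (score, j, phi)
    return best


# Hypothetical 3-class problem in 2 dimensions; the true label is class 0.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
print(select_candidate(W, b, x=[1.0, 0.0], y=0, step=1.0))
```

For a deep network the gradient would come from automatic differentiation at the current iterate, and the step would be repeated and projected back onto the perturbation budget; both are omitted here to keep the selection rule visible.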

Theorems & Definitions (9)

  • Remark
  • Definition 3.1: Mask and Cell
  • Theorem 3.1: WDRO for ReLU
  • proof
  • Corollary 3.2: Practical lower bound
  • Theorem 3.3: WDRO for Smooth Networks
  • proof
  • Lemma A.1: Technical lemma
  • proof