Tight Robustness Certificates and Wasserstein Distributional Attacks for Deep Neural Networks
Bach C. Le, Tung V. Dao, Binh T. Nguyen, Hong T. M. Chu
TL;DR
This paper advances robustness evaluation and certification for deep neural networks by leveraging the local geometry of network activations. It derives tractable, tight bounds for Wasserstein distributionally robust optimization (WDRO) on ReLU and smooth-activation networks through activation-cell masks and Jacobian analysis, providing exact or near-exact local Lipschitz certificates. It then introduces the Wasserstein Distributional Attack (WDA), which constructs distributional adversaries on 2N points by perturbing along margin-aligned directions with a tunable parameter $\kappa$, yielding stronger attacks than point-wise methods and closely matched ensemble baselines. Empirical results on CIFAR-10/100 show that WDA can tighten robustness certificates and reveal greater vulnerability than traditional evaluations, underscoring the value of a distributional perspective in robustness research. The authors also release code for replication.
Abstract
Wasserstein distributionally robust optimization (WDRO) provides a framework for adversarial robustness, yet existing methods based on global Lipschitz continuity or strong duality often yield loose upper bounds or require prohibitive computation. In this work, we address these limitations by introducing a primal approach and adopting a notion of exact Lipschitz certificate to tighten the WDRO upper bound. In addition, we propose a novel Wasserstein distributional attack (WDA) that directly constructs a candidate for the worst-case distribution. Compared to existing point-wise attacks and their variants, our WDA offers greater flexibility in the number and location of attack points. In particular, by leveraging the piecewise-affine structure of ReLU networks on their activation cells, our approach yields an exact, tractable characterization of the corresponding WDRO problem. Extensive evaluations demonstrate that our method achieves competitive robust accuracy against state-of-the-art baselines while offering tighter certificates than existing methods. Our code is available at https://github.com/OLab-Repo/WDA
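To make the piecewise-affine observation concrete, the following is a minimal sketch on a toy one-hidden-layer ReLU network with a scalar output. It is an illustration only and is not taken from the released code: the function names (`relu_mlp`, `local_gradient`, `l2_norm`) and all weights are hypothetical. On a fixed activation cell the network is affine, so its gradient is constant there, and the $\ell_2$ norm of that gradient is the Lipschitz constant of the output restricted to that cell.

```python
import math

def relu_mlp(x, W1, b1, w2, b2):
    """Toy scalar-output ReLU network; returns the output and the activation mask."""
    pre = [sum(w * xj for w, xj in zip(row, x)) + b for row, b in zip(W1, b1)]
    mask = [1.0 if p > 0 else 0.0 for p in pre]       # activation pattern at x
    hidden = [m * p for m, p in zip(mask, pre)]       # ReLU applied via the mask
    out = sum(w * h for w, h in zip(w2, hidden)) + b2
    return out, mask

def local_gradient(mask, W1, w2):
    """Gradient on the cell defined by `mask`: W1^T diag(mask) w2."""
    dim = len(W1[0])
    return [sum(w2[i] * mask[i] * W1[i][j] for i in range(len(W1)))
            for j in range(dim)]

def l2_norm(v):
    return math.sqrt(sum(t * t for t in v))

# Arbitrary illustrative weights (assumptions, not from the paper).
W1 = [[1.0, -2.0], [0.5, 1.0], [-1.0, 0.0]]
b1 = [0.0, -1.0, 0.5]
w2 = [2.0, -1.0, 3.0]
b2 = 0.1

x = [1.0, 0.2]
out, mask = relu_mlp(x, W1, b1, w2, b2)
grad = local_gradient(mask, W1, w2)
local_lipschitz = l2_norm(grad)   # Lipschitz constant on this activation cell

# A small step along the gradient direction that stays inside the same cell.
delta = [0.005 * g for g in grad]
x_pert = [xi + di for xi, di in zip(x, delta)]
out_pert, mask_pert = relu_mlp(x_pert, W1, b1, w2, b2)

# Inside one cell the change in output is exactly linear in delta.
linear_change = sum(g * d for g, d in zip(grad, delta))
print("same cell:", mask == mask_pert)
print("output change:", out_pert - out, "linear prediction:", linear_change)
print("bound:", local_lipschitz * l2_norm(delta))
```

Because the step is aligned with the gradient, the output change meets the bound `local_lipschitz * l2_norm(delta)` with equality (up to rounding), which is the sense in which a cell-wise certificate can be tight where a global Lipschitz constant is loose.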
