Table of Contents
Fetching ...

Toward Certified Robustness Against Real-World Distribution Shifts

Haoze Wu, Teruhiro Tagomori, Alexander Robey, Fengjun Yang, Nikolai Matni, George Pappas, Hamed Hassani, Corina Pasareanu, Clark Barrett

TL;DR

This work tackles certified robustness against real-world distribution shifts by integrating deep generative models that capture perturbations with neural-symbolic verification. It introduces a CEGAR-based framework to verify networks containing S-shaped activations, addressing the limitations of one-shot abstractions for sigmoid outputs. The approach demonstrates improved certifiable robustness on MNIST and CIFAR-10, and analyzes how hyperparameters and training strategies influence verification outcomes. The methodology offers a practical pathway to meaningful guarantees under realistic shifts, while recognizing scalability and generative-model fidelity as areas for future work.

Abstract

We consider the problem of certifying the robustness of deep neural networks against real-world distribution shifts. To do so, we bridge the gap between hand-crafted specifications and realistic deployment settings by proposing a novel neural-symbolic verification framework, in which we train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model. A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations, which are fundamental to many state-of-the-art generative models. To address this challenge, we propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement. The key idea is to "lazily" refine the abstraction of sigmoid functions to exclude spurious counter-examples found in the previous abstraction, thus guaranteeing progress in the verification process while keeping the state-space small. Experiments on the MNIST and CIFAR-10 datasets show that our framework significantly outperforms existing methods on a range of challenging distribution shifts.

Toward Certified Robustness Against Real-World Distribution Shifts

TL;DR

This work tackles certified robustness against real-world distribution shifts by integrating deep generative models that capture perturbations with neural-symbolic verification. It introduces a CEGAR-based framework to verify networks containing S-shaped activations, addressing the limitations of one-shot abstractions for sigmoid outputs. The approach demonstrates improved certifiable robustness on MNIST and CIFAR-10, and analyzes how hyperparameters and training strategies influence verification outcomes. The methodology offers a practical pathway to meaningful guarantees under realistic shifts, while recognizing scalability and generative-model fidelity as areas for future work.

Abstract

We consider the problem of certifying the robustness of deep neural networks against real-world distribution shifts. To do so, we bridge the gap between hand-crafted specifications and realistic deployment settings by proposing a novel neural-symbolic verification framework, in which we train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model. A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations, which are fundamental to many state-of-the-art generative models. To address this challenge, we propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement. The key idea is to "lazily" refine the abstraction of sigmoid functions to exclude spurious counter-examples found in the previous abstraction, thus guaranteeing progress in the verification process while keeping the state-space small. Experiments on the MNIST and CIFAR-10 datasets show that our framework significantly outperforms existing methods on a range of challenging distribution shifts.
Paper Structure (25 sections, 7 theorems, 15 equations, 5 figures, 21 tables, 2 algorithms)

This paper contains 25 sections, 7 theorems, 15 equations, 5 figures, 21 tables, 2 algorithms.

Key Result

Theorem 4.2

Algorithm alg:cegar returns true only if $M \models \Phi$.

Figures (5)

  • Figure 1: Perturbation sets. We illustrate two examples of perturbation sets $\mathcal{S}(x)$.
  • Figure 2: Samples from learned perturbation sets. We show samples from two learned perturbation sets $\mathcal{S}(x)$ on the MNIST and CIFAR-10 datasets. Samples were generated by gridding the latent space of $\mathcal{S}(x)$.
  • Figure 3: An abstraction of the sigmoid activation function.
  • Figure 4: Robustness of ERM against different perturbations.
  • Figure 5: Test set accuracy and verification accuracy

Theorems & Definitions (16)

  • Definition 3.1: Inflection point
  • Definition 3.2: S-shaped function
  • Definition 4.1: Sound abstraction
  • Theorem 4.2: CEGAR is sound
  • Lemma 4.3
  • Lemma 4.4: Soundness of slopes
  • Theorem 4.5: Soundness of refinement
  • proof
  • proof
  • Definition B.1: Tagent line
  • ...and 6 more