Table of Contents
Fetching ...

Training Safe Neural Networks with Global SDP Bounds

Roman Soletskyi, David "davidad" Dalrymple

TL;DR

This work tackles the challenge of training neural networks with formal safety guarantees over large input regions. It introduces an SDP-based verification framework and a novel ADMM-based training scheme to ensure the safety property $\max_{x\in X} f_\theta(x) \le B^{\mathrm{SDP}}(\theta)$, demonstrated on the Adversarial Spheres dataset. Empirical results show provably safe training up to input dimension $d=40$ with polynomial-time scaling, outperforming linear bounds in high dimensions, and revealing important limitations for $\ell_\infty$ safety. The approach offers a path toward safe neural policies and safe RL, with potential extensions to tighter global bounds and integration with SOS-based or reach-avoid certificates for high-dimensional systems.

Abstract

This paper presents a novel approach to training neural networks with formal safety guarantees using semidefinite programming (SDP) for verification. Our method focuses on verifying safety over large, high-dimensional input regions, addressing limitations of existing techniques that focus on adversarial robustness bounds. We introduce an ADMM-based training scheme for an accurate neural network classifier on the Adversarial Spheres dataset, achieving provably perfect recall with input dimensions up to $d=40$. This work advances the development of reliable neural network verification methods for high-dimensional systems, with potential applications in safe RL policies.

Training Safe Neural Networks with Global SDP Bounds

TL;DR

This work tackles the challenge of training neural networks with formal safety guarantees over large input regions. It introduces an SDP-based verification framework and a novel ADMM-based training scheme to ensure the safety property , demonstrated on the Adversarial Spheres dataset. Empirical results show provably safe training up to input dimension with polynomial-time scaling, outperforming linear bounds in high dimensions, and revealing important limitations for safety. The approach offers a path toward safe neural policies and safe RL, with potential extensions to tighter global bounds and integration with SOS-based or reach-avoid certificates for high-dimensional systems.

Abstract

This paper presents a novel approach to training neural networks with formal safety guarantees using semidefinite programming (SDP) for verification. Our method focuses on verifying safety over large, high-dimensional input regions, addressing limitations of existing techniques that focus on adversarial robustness bounds. We introduce an ADMM-based training scheme for an accurate neural network classifier on the Adversarial Spheres dataset, achieving provably perfect recall with input dimensions up to . This work advances the development of reliable neural network verification methods for high-dimensional systems, with potential applications in safe RL policies.
Paper Structure (20 sections, 4 theorems, 39 equations, 2 figures, 4 tables)

This paper contains 20 sections, 4 theorems, 39 equations, 2 figures, 4 tables.

Key Result

Theorem 2.1

Consider the limit as $d, h\to\infty$ while $d/h$ converges to a constant ratio. Let $f_\theta$ be a Xavier initialized glorot2010 feedforward neural network with $L$-hidden layers and the structure described in subsection subsec:setup. Then, for a given $p=2$ or $p=\infty$ and $\varepsilon$, the SD

Figures (2)

  • Figure 1: Bound gap $\Delta^{\mathrm{emp}}$ for SDP and $\alpha$-CROWN on $l_{2,\infty}$ sphere for randomly initialized or trained neural networks
  • Figure 2: Training trajectory for $d=40$, $h=120$ run

Theorems & Definitions (7)

  • Theorem 2.1
  • Theorem 2.2
  • Lemma B.1
  • Lemma B.2
  • proof
  • proof : Proof of Theorem \ref{['thm:theory:sdp-bound']}
  • proof : Proof of Theorem \ref{['thm:theory:alpha-bound']}