Training Safe Neural Networks with Global SDP Bounds

Roman Soletskyi; David "davidad" Dalrymple

Training Safe Neural Networks with Global SDP Bounds

Roman Soletskyi, David "davidad" Dalrymple

TL;DR

This work tackles the challenge of training neural networks with formal safety guarantees over large input regions. It introduces an SDP-based verification framework and a novel ADMM-based training scheme to ensure the safety property $\max_{x\in X} f_\theta(x) \le B^{\mathrm{SDP}}(\theta)$, demonstrated on the Adversarial Spheres dataset. Empirical results show provably safe training up to input dimension $d=40$ with polynomial-time scaling, outperforming linear bounds in high dimensions, and revealing important limitations for $\ell_\infty$ safety. The approach offers a path toward safe neural policies and safe RL, with potential extensions to tighter global bounds and integration with SOS-based or reach-avoid certificates for high-dimensional systems.

Abstract

This paper presents a novel approach to training neural networks with formal safety guarantees using semidefinite programming (SDP) for verification. Our method focuses on verifying safety over large, high-dimensional input regions, addressing limitations of existing techniques that focus on adversarial robustness bounds. We introduce an ADMM-based training scheme for an accurate neural network classifier on the Adversarial Spheres dataset, achieving provably perfect recall with input dimensions up to $d=40$. This work advances the development of reliable neural network verification methods for high-dimensional systems, with potential applications in safe RL policies.

Training Safe Neural Networks with Global SDP Bounds

TL;DR

, demonstrated on the Adversarial Spheres dataset. Empirical results show provably safe training up to input dimension

with polynomial-time scaling, outperforming linear bounds in high dimensions, and revealing important limitations for

safety. The approach offers a path toward safe neural policies and safe RL, with potential extensions to tighter global bounds and integration with SOS-based or reach-avoid certificates for high-dimensional systems.

Abstract

. This work advances the development of reliable neural network verification methods for high-dimensional systems, with potential applications in safe RL policies.

Paper Structure (20 sections, 4 theorems, 39 equations, 2 figures, 4 tables)

This paper contains 20 sections, 4 theorems, 39 equations, 2 figures, 4 tables.

Introduction
Preliminaries
Setup
SDP bound computation
Gap experiments
Theoretical explanation
Methods
Main task
Constrained optimization problem
ADMM scheme
Results
Discussion
Conclusion
Acknowledgements
Experiments details
...and 5 more sections

Key Result

Theorem 2.1

Consider the limit as $d, h\to\infty$ while $d/h$ converges to a constant ratio. Let $f_\theta$ be a Xavier initialized glorot2010 feedforward neural network with $L$-hidden layers and the structure described in subsection subsec:setup. Then, for a given $p=2$ or $p=\infty$ and $\varepsilon$, the SD

Figures (2)

Figure 1: Bound gap $\Delta^{\mathrm{emp}}$ for SDP and $\alpha$-CROWN on $l_{2,\infty}$ sphere for randomly initialized or trained neural networks
Figure 2: Training trajectory for $d=40$, $h=120$ run

Theorems & Definitions (7)

Theorem 2.1
Theorem 2.2
Lemma B.1
Lemma B.2
proof
proof : Proof of Theorem \ref{['thm:theory:sdp-bound']}
proof : Proof of Theorem \ref{['thm:theory:alpha-bound']}

Training Safe Neural Networks with Global SDP Bounds

TL;DR

Abstract

Training Safe Neural Networks with Global SDP Bounds

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (2)

Theorems & Definitions (7)