Table of Contents
Fetching ...

Achieving the Tightest Relaxation of Sigmoids for Formal Verification

Samuel Chevalier, Duncan Starkenburg, Krishnamurthy Dvijotham

TL;DR

The paper tackles the challenge of verifiably bounding neural networks containing sigmoid activations by developing α-sig, a differentiable, tunable affine relaxation that yields the tightest possible element-wise convex bound. It derives a closed-form β(α) relation and optimizes (α,β) in a dual space, with a backward bound propagation step to decide upper versus lower bounds per activation. A projected gradient routine, aided by a sequential quadratic approach to slope bounds, solves the dual problem efficiently, and eliminating β via β = h(α) further simplifies updates. Empirical results show α-sig can outperform state-of-the-art methods like α-CROWN and LiRPA in speed and, in many cases, bound tightness, offering a scalable path toward verification of sigmoid-containing networks and potential extensions to other S-shaped activations and softmax components.

Abstract

In the field of formal verification, Neural Networks (NNs) are typically reformulated into equivalent mathematical programs which are optimized over. To overcome the inherent non-convexity of these reformulations, convex relaxations of nonlinear activation functions are typically utilized. Common relaxations (i.e., static linear cuts) of "S-shaped" activation functions, however, can be overly loose, slowing down the overall verification process. In this paper, we derive tuneable hyperplanes which upper and lower bound the sigmoid activation function. When tuned in the dual space, these affine bounds smoothly rotate around the nonlinear manifold of the sigmoid activation function. This approach, termed $α$-sig, allows us to tractably incorporate the tightest possible, element-wise convex relaxation of the sigmoid activation function into a formal verification framework. We embed these relaxations inside of large verification tasks and compare their performance to LiRPA and $α$-CROWN, a state-of-the-art verification duo.

Achieving the Tightest Relaxation of Sigmoids for Formal Verification

TL;DR

The paper tackles the challenge of verifiably bounding neural networks containing sigmoid activations by developing α-sig, a differentiable, tunable affine relaxation that yields the tightest possible element-wise convex bound. It derives a closed-form β(α) relation and optimizes (α,β) in a dual space, with a backward bound propagation step to decide upper versus lower bounds per activation. A projected gradient routine, aided by a sequential quadratic approach to slope bounds, solves the dual problem efficiently, and eliminating β via β = h(α) further simplifies updates. Empirical results show α-sig can outperform state-of-the-art methods like α-CROWN and LiRPA in speed and, in many cases, bound tightness, offering a scalable path toward verification of sigmoid-containing networks and potential extensions to other S-shaped activations and softmax components.

Abstract

In the field of formal verification, Neural Networks (NNs) are typically reformulated into equivalent mathematical programs which are optimized over. To overcome the inherent non-convexity of these reformulations, convex relaxations of nonlinear activation functions are typically utilized. Common relaxations (i.e., static linear cuts) of "S-shaped" activation functions, however, can be overly loose, slowing down the overall verification process. In this paper, we derive tuneable hyperplanes which upper and lower bound the sigmoid activation function. When tuned in the dual space, these affine bounds smoothly rotate around the nonlinear manifold of the sigmoid activation function. This approach, termed -sig, allows us to tractably incorporate the tightest possible, element-wise convex relaxation of the sigmoid activation function into a formal verification framework. We embed these relaxations inside of large verification tasks and compare their performance to LiRPA and -CROWN, a state-of-the-art verification duo.
Paper Structure (15 sections, 31 equations, 6 figures, 3 tables, 1 algorithm)

This paper contains 15 sections, 31 equations, 6 figures, 3 tables, 1 algorithm.

Figures (6)

  • Figure 1: Convex relaxation of ReLU activation function, lower bounded by $y=\alpha x$, with tunable $\alpha\in [0,1]$.
  • Figure 2: Rotating affine bounds around the sigmoid activation function, achieving maximally tight convex relaxation.
  • Figure 3: For max and min input values ($\overline x$, $\underline x$), depicted are the steepest and shallowest upper bound slopes (${\overline \alpha}_u$, ${\underline \alpha}_u$), and the steepest and shallowest lower bound slopes (${\overline \alpha}_l$, ${\underline \alpha}_l$).
  • Figure 4: Since the slope at $\overline x$ is sufficiently steep, \ref{['eq: upper_equal']} is applied, and the tightest upper bound is a static line.
  • Figure 5: Gradient ascent iterations of $\alpha$-sig across 6 NN model sizes. Test are associated with $\tau_1$ from Table \ref{['tab:compareSolvers']}. Initial and final (i.e., optimized) CROWN bounds are also shown.
  • ...and 1 more figures

Theorems & Definitions (1)

  • Definition 1