Novel Quadratic Constraints for Extending LipSDP beyond Slope-Restricted Activations

Patricia Pauli; Aaron Havens; Alexandre Araujo; Siddharth Garg; Farshad Khorrami; Frank Allgöwer; Bin Hu

TL;DR

This work extends LipSDP beyond slope-restricted activations by deriving novel quadratic constraints for GroupSort, MaxMin, and Householder activations that preserve key input-output properties. Using these constraints, the authors formulate SDP conditions that yield tight $\ell_2$ and $\ell_\infty$ Lipschitz bounds for both non-residual and residual networks, as well as implicit models. The approach unifies Lipschitz analysis across a broad class of architectures and activations, and experimental results on MNIST demonstrate substantially less conservative bounds than prior methods. The framework also offers practical scalability strategies and extends to CNNs, DEQs, and neural ODEs, highlighting its potential impact on robust and certifiable learning systems.

Abstract

Recently, semidefinite programming (SDP) techniques have shown great promise in providing accurate Lipschitz bounds for neural networks. Specifically, the LipSDP approach (Fazlyab et al., 2019) has received much attention and provides the least conservative Lipschitz upper bounds that can be computed with polynomial time guarantees. However, one main restriction of LipSDP is that its formulation requires the activation functions to be slope-restricted on $[0,1]$, preventing its further use for more general activation functions such as GroupSort, MaxMin, and Householder. One can, for example, rewrite MaxMin activations as residual ReLU networks. However, a direct application of LipSDP to the resultant residual ReLU networks is conservative and even fails to recover the well-known fact that the MaxMin activation is 1-Lipschitz. Our paper bridges this gap and extends LipSDP beyond slope-restricted activation functions. To this end, we provide novel quadratic constraints for GroupSort, MaxMin, and Householder activations by leveraging their underlying properties such as sum preservation. Our proposed analysis is general and provides a unified approach for estimating $\ell_2$ and $\ell_\infty$ Lipschitz bounds for a rich class of neural network architectures, including non-residual and residual neural networks and implicit models, with GroupSort, MaxMin, and Householder activations. Finally, we illustrate the utility of our approach with a variety of experiments and show that our proposed SDPs generate less conservative Lipschitz bounds in comparison to existing approaches.
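The rewriting of MaxMin as a residual ReLU network mentioned in the abstract can be illustrated with a minimal sketch. It relies on the elementary identities $\max(a,b) = a + \mathrm{ReLU}(b-a)$ and $\min(a,b) = b - \mathrm{ReLU}(b-a)$. The function names, the pairing of consecutive entries, and the (max, min) output ordering are assumptions made here for illustration and are not taken from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def maxmin(x):
    """MaxMin on consecutive pairs: (a, b) -> (max(a, b), min(a, b)).
    The pairing and the output ordering are illustrative assumptions."""
    pairs = x.reshape(-1, 2)
    out = np.stack([pairs.max(axis=1), pairs.min(axis=1)], axis=1)
    return out.reshape(x.shape)

def maxmin_residual_relu(x):
    """The same map written as identity plus a ReLU correction term:
    max(a, b) = a + relu(b - a),  min(a, b) = b - relu(b - a)."""
    pairs = x.reshape(-1, 2)
    a, b = pairs[:, 0], pairs[:, 1]
    r = relu(b - a)
    out = np.stack([a + r, b - r], axis=1)
    return out.reshape(x.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.normal(size=8), rng.normal(size=8)
    # Both formulations agree on random inputs.
    print(np.allclose(maxmin(x), maxmin_residual_relu(x)))
    # Empirical check of the 1-Lipschitz property in the l2 norm.
    print(np.linalg.norm(maxmin(x) - maxmin(y)) <= np.linalg.norm(x - y) + 1e-12)
```

The two functions compute the same map, yet only the direct view makes the 1-Lipschitz property evident; the abstract notes that applying LipSDP to the residual ReLU form does not recover it.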

Paper Structure (44 sections, 10 theorems, 77 equations, 4 tables)

This paper contains 44 sections, 10 theorems, 77 equations, 4 tables.

Introduction
Preliminaries
Lipschitz bounds for neural networks: A brief review
Spectral norm product bound for $\phi$ being $1$-Lipschitz.
GroupSort and Householder activations
Motivation and Problem Statement
Main Results: SDPs for GroupSort and Householder activations
Quadratic constraints for GroupSort and Householder activations
Proof sketch.
$\ell_2\rightarrow \ell_2$ Lipschitz bounds for GroupSort/Householder neural networks
Simplifications for the single-layer neural network.
$\ell_\infty\rightarrow \ell_1$ Lipschitz bounds for GroupSort/Householder neural networks
Single-layer case.
Multi-layer case.
Further generalizations: Residual networks and implicit models
...and 29 more sections

Key Result

Lemma 1

Consider a GroupSort activation $\phi:\mathbb{R}^{n}\to\mathbb{R}^{n}$ with group size $n_g$. Let $N=\frac{n}{n_g}$. Given any $\lambda \in \mathbb{R}_+^N$ and $\gamma, \nu, \tau \in \mathbb{R}^N$, the following inequality holds for all $x,y\in \mathbb{R}^n$: where $(T,S,P)$ are given as
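The abstract states that the quadratic constraints are derived from underlying properties of the activations such as sum preservation. As a hedged numerical sketch, and not the statement of Lemma 1 itself, the code below checks two such properties for GroupSort with $N = n/n_g$ groups: within each group the sum of $\phi(x)-\phi(y)$ equals the sum of $x-y$, and the group-wise $\ell_2$ norm of $\phi(x)-\phi(y)$ does not exceed that of $x-y$. The function names, the grouping of consecutive entries, and the ascending sort order are assumptions for illustration.

```python
import numpy as np

def group_sort(x, n_g):
    """Sort each consecutive group of n_g entries (ascending order assumed)."""
    n = x.shape[0]
    if n % n_g != 0:
        raise ValueError("n must be divisible by the group size n_g")
    return np.sort(x.reshape(n // n_g, n_g), axis=1).reshape(n)

def check_properties(n=12, n_g=3, trials=1000, seed=0):
    """Empirically check group-wise sum preservation and 1-Lipschitzness."""
    rng = np.random.default_rng(seed)
    N = n // n_g  # number of groups
    for _ in range(trials):
        x, y = rng.normal(size=n), rng.normal(size=n)
        dx = (x - y).reshape(N, n_g)
        dphi = (group_sort(x, n_g) - group_sort(y, n_g)).reshape(N, n_g)
        # Sum preservation: sorting permutes entries within a group.
        if not np.allclose(dx.sum(axis=1), dphi.sum(axis=1)):
            return False
        # Group-wise 1-Lipschitz property in the l2 norm.
        if np.any(np.linalg.norm(dphi, axis=1) > np.linalg.norm(dx, axis=1) + 1e-12):
            return False
    return True

if __name__ == "__main__":
    print(check_properties())
```

With $n_g = 2$ this reduces to a MaxMin-style activation, up to the ordering of the two outputs.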

Theorems & Definitions (14)

Lemma 1
Lemma 2
Theorem 1
Theorem 2
Theorem 3
Theorem 4
Theorem F.1
proof
Theorem F.2
proof
...and 4 more
