Novel Quadratic Constraints for Extending LipSDP beyond Slope-Restricted Activations
Patricia Pauli, Aaron Havens, Alexandre Araujo, Siddharth Garg, Farshad Khorrami, Frank Allgöwer, Bin Hu
TL;DR
This work extends LipSDP beyond slope-restricted activations by deriving novel quadratic constraints for GroupSort, MaxMin, and Householder activations that preserve key input-output properties. Using these constraints, the authors formulate SDP conditions that yield tight $\ell_2$ and $\ell_\infty$ Lipschitz bounds for both non-residual and residual networks, as well as implicit models. The approach unifies Lipschitz analysis across a broad class of architectures and activations, and experimental results on MNIST demonstrate substantially less conservative bounds than prior methods. The framework also offers practical scalability strategies and extends to CNNs, DEQs, and neural ODEs, highlighting its potential impact on robust and certifiable learning systems.
Abstract
Recently, semidefinite programming (SDP) techniques have shown great promise in providing accurate Lipschitz bounds for neural networks. Specifically, the LipSDP approach (Fazlyab et al., 2019) has received much attention and provides the least conservative Lipschitz upper bounds that can be computed with polynomial time guarantees. However, one main restriction of LipSDP is that its formulation requires the activation functions to be slope-restricted on $[0,1]$, preventing its further use for more general activation functions such as GroupSort, MaxMin, and Householder. One can, for example, rewrite MaxMin activations as residual ReLU networks. However, a direct application of LipSDP to the resulting residual ReLU networks is conservative and even fails to recover the well-known fact that the MaxMin activation is 1-Lipschitz. Our paper bridges this gap and extends LipSDP beyond slope-restricted activation functions. To this end, we provide novel quadratic constraints for GroupSort, MaxMin, and Householder activations by leveraging their underlying properties, such as sum preservation. Our proposed analysis is general and provides a unified approach for estimating $\ell_2$ and $\ell_\infty$ Lipschitz bounds for a rich class of neural network architectures, including non-residual and residual neural networks and implicit models, with GroupSort, MaxMin, and Householder activations. Finally, we illustrate the utility of our approach with a variety of experiments and show that our proposed SDPs generate less conservative Lipschitz bounds than existing approaches.
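To make the properties mentioned in the abstract concrete, the following NumPy sketch implements a MaxMin activation, rewrites it as a residual ReLU map, and numerically checks sum preservation and the 1-Lipschitz property on random inputs. It is only an illustration under stated assumptions: the function names `maxmin` and `maxmin_residual_relu` are hypothetical (not taken from the paper's code), MaxMin is assumed to act on consecutive pairs of entries, and the residual form uses the identities $\max(a,b) = a + \mathrm{ReLU}(b-a)$ and $\min(a,b) = b - \mathrm{ReLU}(b-a)$.

```python
import numpy as np

def maxmin(x):
    """MaxMin activation (assumed pairing): map each consecutive pair (a, b) to (max, min)."""
    pairs = x.reshape(-1, 2)
    out = np.stack([pairs.max(axis=1), pairs.min(axis=1)], axis=1)
    return out.reshape(-1)

def maxmin_residual_relu(x):
    """Same map as a residual ReLU layer: (a, b) -> (a + r, b - r) with r = relu(b - a)."""
    pairs = x.reshape(-1, 2)
    r = np.maximum(pairs[:, 1] - pairs[:, 0], 0.0)  # relu(b - a)
    out = pairs + np.stack([r, -r], axis=1)         # skip connection plus ReLU branch
    return out.reshape(-1)

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
y = rng.standard_normal(8)

# Both formulations describe the same function.
same_map = np.allclose(maxmin(x), maxmin_residual_relu(x))

# Sum preservation: MaxMin only reorders entries within each pair.
sum_preserved = np.isclose(maxmin(x).sum(), x.sum())

# Empirical 1-Lipschitz check in the l2 norm on one random pair of inputs.
lipschitz_ok = np.linalg.norm(maxmin(x) - maxmin(y)) <= np.linalg.norm(x - y) + 1e-12

print(same_map, sum_preserved, lipschitz_ok)
```

The random check does not prove the Lipschitz bound; it only illustrates the fact that the abstract says a direct LipSDP analysis of the residual ReLU form fails to recover.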
