Scaling the Convex Barrier with Sparse Dual Algorithms
Alessandro De Palma, Harkirat Singh Behl, Rudy Bunel, Philip H. S. Torr, M. Pawan Kumar
TL;DR
This work tackles scalable neural network verification by moving beyond the common Planet relaxation to a tighter relaxation that captures cross-layer interactions via exponential constraint sets. It develops two sparse dual solvers, Active Set and Saddle Point, to solve the dual of this tighter relaxation efficiently, enabling strong lower bounds with GPU-parallelizable computations. A Big-M initialization provides a practical starting point, while a cutting-plane approach can tighten the bounds further when needed. The authors demonstrate substantial speedups and improved verification performance in both incomplete and complete settings, including branch-and-bound scenarios, and provide a scalable framework with practical implementations. Overall, the paper shows that combining tight relaxations with sparse, memory-efficient dual optimization markedly enhances neural network verification capabilities with real-world impact for robustness and safety-critical analyses.
Abstract
Tight and efficient neural network bounding is crucial to the scaling of neural network verification systems. Many efficient bounding algorithms have been presented recently, but they are often too loose to verify more challenging properties. This is due to the weakness of the employed relaxation, which is usually a linear program of size linear in the number of neurons. While a tighter linear relaxation for piecewise-linear activations exists, it comes at the cost of exponentially many constraints and currently lacks an efficient customized solver. We alleviate this deficiency by presenting two novel dual algorithms: one operates a subgradient method on a small active set of dual variables, the other exploits the sparsity of Frank-Wolfe type optimizers to incur only a linear memory cost. Both methods recover the strengths of the new relaxation: tightness and a linear separation oracle. At the same time, they share the benefits of previous dual approaches for weaker relaxations: massive parallelism, GPU implementation, low cost per iteration and valid bounds at any time. As a consequence, we can obtain better bounds than off-the-shelf solvers in only a fraction of their running time, attaining significant formal verification speed-ups.
