Verifying Properties of Binary Neural Networks Using Sparse Polynomial Optimization
Jianting Yang, Srećko Ðurašinović, Jean-Bernard Lasserre, Victor Magron, Jun Zhao
TL;DR
This work targets robustness verification for Binary Neural Networks (BNNs) under adversarial perturbations with continuous inputs. It encodes the BNN verification problem as a sparse polynomial optimization problem leveraging the semi-algebraic sign activation and solves the first-order SDP relaxation to obtain lower bounds, achieving tighter certificates by adding tautologies to enlarge the first-order quadratic module. The approach yields up to 55% tighter bounds than linear relaxations and delivers substantial speedups (e.g., ~4.5x for $\|\cdot\|_\infty$ and ~11.4x for $\|\cdot\|_2$) while handling both $\|\cdot\|_\infty$ and $\|\cdot\|_2$ perturbations in continuous input spaces. These results indicate scalable, rigorous verification for larger BNNs and motivate integrating SDP relaxations into MILP/MINP-based verification pipelines for enhanced performance.
Abstract
This paper explores methods for verifying the properties of Binary Neural Networks (BNNs), focusing on robustness against adversarial attacks. Despite their lower computational and memory needs, BNNs, like their full-precision counterparts, are also sensitive to input perturbations. Established methods for solving this problem are predominantly based on Satisfiability Modulo Theories and Mixed-Integer Linear Programming techniques, which are characterized by NP complexity and often face scalability issues. We introduce an alternative approach using Semidefinite Programming relaxations derived from sparse Polynomial Optimization. Our approach, compatible with continuous input space, not only mitigates numerical issues associated with floating-point calculations but also enhances verification scalability through the strategic use of tighter first-order semidefinite relaxations. We demonstrate the effectiveness of our method in verifying robustness against both $\|.\|_\infty$ and $\|.\|_2$-based adversarial attacks.
