Tight Verification of Probabilistic Robustness in Bayesian Neural Networks
Ben Batten, Mehran Hosseini, Alessio Lomuscio
TL;DR
This work tackles the challenge of verifying probabilistic robustness in Bayesian Neural Networks, where weight uncertainty makes exact certification intractable. It introduces two iterative, dynamic-region expansion methods—Pure Dynamic Scaling (PIE) and Gradient-Guided Dynamic Scaling (GIE)—to compute provably tighter lower bounds than state-of-the-art sampling approaches. Empirical results on MNIST and CIFAR-10 show substantial improvements in probabilistic certification, with bounds often doubling those of prior methods; gradient guidance generally yields the strongest guarantees. The methods are designed to be compatible with any BNN verification backend, enabling tighter safety assurances for probabilistic models in safety-critical contexts while highlighting practical considerations such as computation time and iteration depth.
Abstract
We introduce two algorithms for computing tight guarantees on the probabilistic robustness of Bayesian Neural Networks (BNNs). Computing robustness guarantees for BNNs is a significantly more challenging task than verifying the robustness of standard Neural Networks (NNs) because it requires searching the parameters' space for safe weights. Moreover, tight and complete approaches for the verification of standard NNs, such as those based on Mixed-Integer Linear Programming (MILP), cannot be directly used for the verification of BNNs because of the polynomial terms resulting from the consecutive multiplication of variables encoding the weights. Our algorithms efficiently and effectively search the parameters' space for safe weights by using iterative expansion and the network's gradient and can be used with any verification algorithm of choice for BNNs. In addition to proving that our algorithms compute tighter bounds than the SoA, we also evaluate our algorithms against the SoA on standard benchmarks, such as MNIST and CIFAR10, showing that our algorithms compute bounds up to 40% tighter than the SoA.
