Table of Contents
Fetching ...

SoundnessBench: A Soundness Benchmark for Neural Network Verifiers

Xingjian Zhou, Keyi Shen, Andy Xu, Hongji Xu, Cho-Jui Hsieh, Huan Zhang, Zhouxing Shi

TL;DR

SoundnessBench introduces a ground-truth-focused benchmark for neural network verifiers by embedding hidden counterexamples, enabling robust evaluation of verifier soundness. It employs a two-objective training scheme—margin-based pre-defined counterexamples and adversarial training with a perturbation sliding window—to produce unverifiable instances while maintaining broad robustness on regular inputs. The benchmark spans 26 models across CNN, MLP, and ViT architectures and reveals real bugs in several state-of-the-art verifiers, including $ ext{ABC}$-CROWN, NeuralSAT, and Marabou 2023, with fixes observed in downstream tooling. While limited by small-scale data and existing attack methods, SoundnessBench provides a valuable framework for stress-testing and improving the reliability of NN verification in safety-critical domains.

Abstract

Neural network (NN) verification aims to formally verify properties of NNs, which is crucial for ensuring the behavior of NN-based models in safety-critical applications. In recent years, the community has developed many NN verifiers and benchmarks to evaluate them. However, existing benchmarks typically lack ground-truth for hard instances where no current verifier can verify the property and no counterexample can be found. This makes it difficult to validate the soundness of a verifier, when it claims verification on such challenging instances that no other verifier can handle. In this work, we develop a new benchmark for NN verification, named SoundnessBench, specifically for testing the soundness of NN verifiers. SoundnessBench consists of instances with deliberately inserted counterexamples that are hidden from adversarial attacks commonly used to find counterexamples. Thereby, it can identify false verification claims when hidden counterexamples are known to exist. We design a training method to produce NNs with hidden counterexamples and systematically construct our SoundnessBench with instances across various model architectures, activation functions, and input data. We demonstrate that our training effectively produces hidden counterexamples and our SoundnessBench successfully identifies bugs in state-of-the-art NN verifiers. Our code is available at https://github.com/mvp-harry/SoundnessBench and our dataset is available at https://huggingface.co/datasets/SoundnessBench/SoundnessBench.

SoundnessBench: A Soundness Benchmark for Neural Network Verifiers

TL;DR

SoundnessBench introduces a ground-truth-focused benchmark for neural network verifiers by embedding hidden counterexamples, enabling robust evaluation of verifier soundness. It employs a two-objective training scheme—margin-based pre-defined counterexamples and adversarial training with a perturbation sliding window—to produce unverifiable instances while maintaining broad robustness on regular inputs. The benchmark spans 26 models across CNN, MLP, and ViT architectures and reveals real bugs in several state-of-the-art verifiers, including -CROWN, NeuralSAT, and Marabou 2023, with fixes observed in downstream tooling. While limited by small-scale data and existing attack methods, SoundnessBench provides a valuable framework for stress-testing and improving the reliability of NN verification in safety-critical domains.

Abstract

Neural network (NN) verification aims to formally verify properties of NNs, which is crucial for ensuring the behavior of NN-based models in safety-critical applications. In recent years, the community has developed many NN verifiers and benchmarks to evaluate them. However, existing benchmarks typically lack ground-truth for hard instances where no current verifier can verify the property and no counterexample can be found. This makes it difficult to validate the soundness of a verifier, when it claims verification on such challenging instances that no other verifier can handle. In this work, we develop a new benchmark for NN verification, named SoundnessBench, specifically for testing the soundness of NN verifiers. SoundnessBench consists of instances with deliberately inserted counterexamples that are hidden from adversarial attacks commonly used to find counterexamples. Thereby, it can identify false verification claims when hidden counterexamples are known to exist. We design a training method to produce NNs with hidden counterexamples and systematically construct our SoundnessBench with instances across various model architectures, activation functions, and input data. We demonstrate that our training effectively produces hidden counterexamples and our SoundnessBench successfully identifies bugs in state-of-the-art NN verifiers. Our code is available at https://github.com/mvp-harry/SoundnessBench and our dataset is available at https://huggingface.co/datasets/SoundnessBench/SoundnessBench.

Paper Structure

This paper contains 44 sections, 12 equations, 1 figure, 15 tables, 1 algorithm.

Figures (1)

  • Figure 1: A high-level overview of our SoundnessBench and its purpose. NN models are trained on a synthetic dataset using the two-objective training framework described in \ref{['sec:training']}, to produce both unverifiable and regular instances. Unverifiable instances are then used to test the soundness of various NN verifiers, while regular instances are also added to prevent developers of NN verifiers from directly knowing all the instances are unverifiable. A bug is detected if an NN verifier claims a successful verification for any unverifiable instance.