Table of Contents
Fetching ...

The Fifth International Verification of Neural Networks Competition (VNN-COMP 2024): Summary and Results

Christopher Brix, Stanley Bak, Taylor T. Johnson, Haoze Wu

TL;DR

VNN-COMP 2024 advances the neural-network verification field by expanding standardized interfaces (ONNX, VNN-LIB) and automating evaluation on AWS, enabling fair comparison across eight teams and twenty benchmarks. The results indicate a continued lead of GPU-accelerated bound-propagation with branch-and-bound methods (notably α,β-CROWN) and provide in-depth analyses across diverse domains (ACAS Xu, ViT, NLP, and safety benchmarks). The report also documents runtime and scoring nuances, reports lessons learned, and outlines concrete improvements for future competitions, including better handling of tool-output formats and broader benchmark sets. Overall, the work strengthens reproducibility and scalability in NN verification while highlighting practical pathways toward industrially relevant guarantees.

Abstract

This report summarizes the 5th International Verification of Neural Networks Competition (VNN-COMP 2024), held as a part of the 7th International Symposium on AI Verification (SAIV), that was collocated with the 36th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2024 iteration, 8 teams participated on a diverse set of 12 regular and 8 extended benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.

The Fifth International Verification of Neural Networks Competition (VNN-COMP 2024): Summary and Results

TL;DR

VNN-COMP 2024 advances the neural-network verification field by expanding standardized interfaces (ONNX, VNN-LIB) and automating evaluation on AWS, enabling fair comparison across eight teams and twenty benchmarks. The results indicate a continued lead of GPU-accelerated bound-propagation with branch-and-bound methods (notably α,β-CROWN) and provide in-depth analyses across diverse domains (ACAS Xu, ViT, NLP, and safety benchmarks). The report also documents runtime and scoring nuances, reports lessons learned, and outlines concrete improvements for future competitions, including better handling of tool-output formats and broader benchmark sets. Overall, the work strengthens reproducibility and scalability in NN verification while highlighting practical pathways toward industrially relevant guarantees.

Abstract

This report summarizes the 5th International Verification of Neural Networks Competition (VNN-COMP 2024), held as a part of the 7th International Symposium on AI Verification (SAIV), that was collocated with the 36th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2024 iteration, 8 teams participated on a diverse set of 12 regular and 8 extended benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.
Paper Structure (55 sections, 30 figures, 36 tables)

This paper contains 55 sections, 30 figures, 36 tables.

Figures (30)

  • Figure 1: Accuracy Efficient Architecture for GTSRB and Belgium dataset
  • Figure 2: Accuracy Efficient Architecture for Chinese dataset
  • Figure 3: XNOR(QConv) architecture
  • Figure 4: Generic approach to generating the NLP verification pipelines casadio2023antoniocasadio2024nlp deployed to obtain the safeNLP benchmark.
  • Figure 5: Cactus Plot for All Instances (Regular Track).
  • ...and 25 more figures