The Fourth International Verification of Neural Networks Competition (VNN-COMP 2023): Summary and Results

Christopher Brix, Stanley Bak, Changliu Liu, Taylor T. Johnson

TL;DR

VNN-COMP 2023 advances fair comparison of neural network verifiers through standardized ONNX/VNN-LIB formats and automated AWS-based evaluation. Seven teams competed on ten scored and four unscored benchmarks, highlighting a convergence toward GPU-accelerated linear bound propagation with branch-and-bound, while expanding benchmark diversity to include transformers, distribution shifts, and real-world systems. The report documents rules, participating tools, and benchmarks, analyzes results with emphasis on scoring and overhead, and discusses lessons learned and directions for future competitions. Overall, the work reinforces the value of automation and standardization in driving progress in neural network verification for safety-critical applications.

Abstract

This report summarizes the 4th International Verification of Neural Networks Competition (VNN-COMP 2023), held as part of the 6th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), which was collocated with the 35th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specifications (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2023 iteration, 7 teams participated on a diverse set of 10 scored and 4 unscored benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of the competition.

Paper Structure (45 sections, 33 figures, 40 tables)

Figures (33)

  • Figure 1: Accuracy Efficient Architecture for GTSRB and Belgium dataset
  • Figure 2: Accuracy Efficient Architecture for Chinese dataset
  • Figure 3: XNOR(QConv) architecture
  • Figure 4: Cactus Plot for All Instances.
  • Figure 5: Cactus Plot for All Scored Instances.
  • ...and 28 more figures