Verifiable evaluations of machine learning models using zkSNARKs

Tobin South, Alexander Camuto, Shrey Jain, Shayla Nguyen, Robert Mahari, Christian Paquin, Jason Morton, Alex 'Sandy' Pentland

TL;DR

The paper addresses the problem of verifying performance claims for closed-weight ML systems, where end users cannot independently reproduce reported benchmarks. It introduces a verifiable evaluation framework based on zkSNARKs that proves inference over datasets without revealing model weights, using a four-step ONNX-to-proof workflow and a 'predict, then prove' strategy to separate inference from proof generation. The contributions include a generalizable, end-to-end attestation framework, a flexible proving system (mapping ONNX to proof circuits via Halo2 and the ezkl toolkit), and a discussion of proving costs with demonstrations across diverse architectures, along with privacy-preserving aggregation and challenge-based audits. This approach enables transparent, auditable benchmarking and model integrity checks in high-stakes or private-weight contexts, while maintaining weight confidentiality and scalability through modular proofs and attestations.

Abstract

As closed-source commercial machine learning models proliferate, model evaluations from developers must be taken at face value. These benchmark results, whether over task accuracy, bias evaluations, or safety checks, are traditionally impossible for a model end-user to verify without the costly or impossible process of re-performing the benchmark on black-box model outputs. This work presents a method of verifiable model evaluation using model inference through zkSNARKs. The resulting zero-knowledge computational proofs of model outputs over datasets can be packaged into verifiable evaluation attestations showing that models with fixed private weights achieve stated performance or fairness metrics over public inputs. We present a flexible proving system that enables verifiable attestations to be performed on any standard neural network model with varying compute requirements. For the first time, we demonstrate this across a sample of real-world models and highlight key challenges and design solutions. This presents a new transparency paradigm in the verifiable evaluation of private models.

Paper Structure

This paper contains 36 sections, 3 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: A high-level overview of the motivations and system design, which is augmented by the flexible ezkl proving system that can handle any ML model.
  • Figure 2: System diagram of verifiable ML evaluation using the zkSNARK ezkl toolkit. A model can be compiled into a proving key ($pk$) and verification key ($vk$), which can be used to generate repeated inference proofs ($\pi$) over a dataset; these proofs can then be aggregated into a verifiable evaluation attestation. Using the same proving and verification keys, any future inference of a model can be checked to confirm that a model with the same model weight hash, $H(W)$, was used to generate the output. Inference data can be arbitrary.
  • Figure 3: Time and RAM requirements for model proofs with increasing model sizes across multi-layer perceptrons (MLP), convolutional neural networks (CNN), and attention-based transformers (Attn). Model requirements scale linearly with the number of constraints, driven by the number of operations used in a model inference.
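
The aggregation step described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the ezkl API or the authors' implementation. The names `InferenceRecord`, `weight_hash`, and `build_attestation` are hypothetical, SHA-256 stands in for whatever commitment $H(W)$ the real system uses, and the boolean `proof_verified` stands in for the outcome of checking a proof $\pi$ against $vk$. The sketch only shows the bookkeeping: every inference must carry the same weight hash and a verified proof before a metric is reported.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceRecord:
    """One proven inference (hypothetical structure, for illustration only)."""
    input_id: str
    predicted: int
    label: int
    model_hash: str        # stand-in for H(W), the committed weight hash
    proof_verified: bool   # stand-in for "proof pi verified against vk"


def weight_hash(weights: bytes) -> str:
    """Assumption: SHA-256 as a simple stand-in for the weight commitment H(W)."""
    return hashlib.sha256(weights).hexdigest()


def build_attestation(records, expected_hash):
    """Aggregate per-input records into a single evaluation claim.

    Raises ValueError if any record uses a different weight hash or
    carries a proof that did not verify.
    """
    if not records:
        raise ValueError("no inference records to aggregate")
    for r in records:
        if r.model_hash != expected_hash:
            raise ValueError(f"{r.input_id}: weight hash mismatch")
        if not r.proof_verified:
            raise ValueError(f"{r.input_id}: proof did not verify")
    correct = sum(r.predicted == r.label for r in records)
    return {
        "model_hash": expected_hash,
        "n_inferences": len(records),
        "accuracy": correct / len(records),
    }


if __name__ == "__main__":
    h = weight_hash(b"toy-weights")
    demo = [
        InferenceRecord("x0", 1, 1, h, True),
        InferenceRecord("x1", 0, 1, h, True),
    ]
    print(build_attestation(demo, h))
```

Rejecting the whole batch on a single hash mismatch reflects the caption's requirement that all outputs come from a model with the same $H(W)$. A real attestation would also bundle the proofs themselves so that a third party can re-verify them.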