Trustless Audits without Revealing Data or Models

Suppakit Waiwitlikhit; Ion Stoica; Yi Sun; Tatsunori Hashimoto; Daniel Kang

TL;DR

Trustless audits for ML training are possible without revealing data or weights by publishing commitments and zero-knowledge proofs that training occurred on the committed data. ZkAudit-T proves SGD steps from hidden data/weights, and ZkAudit-I enables arbitrary audits by proving $F(\text{data},\text{weights})$ with a ZK-SNARK, while maintaining privacy. The approach extends ZK-SNARKs to gradient descent via rounded division and fixed-point arithmetic and introduces a high-performance softmax to achieve competitive accuracy on ImageNet-scale tasks. Empirical results show feasible proving costs and near-FP32 accuracy on image classification and recommender systems, with practical audit costs for censorship, counterfactual, and copyright-related checks. The work demonstrates a viable path to privacy-preserving, trustless ML audits with broad potential impact and clear directions for scaling to larger models and language tasks.

Abstract

There is an increasing conflict between business incentives to hide models and data as trade secrets, and the societal need for algorithmic transparency. For example, a rightsholder wishing to know whether their copyrighted works have been used during training must convince the model provider to allow a third party to audit the model and data. Finding a mutually agreeable third party is difficult, and the associated costs often make this approach impractical. In this work, we show that it is possible to simultaneously allow model providers to keep their model weights (but not architecture) and data secret while allowing other parties to trustlessly audit model and data properties. We do this by designing a protocol called ZkAudit in which model providers publish cryptographic commitments of datasets and model weights, alongside a zero-knowledge proof (ZKP) certifying that published commitments are derived from training the model. Model providers can then respond to audit requests by privately computing any function F of the dataset (or model) and releasing the output of F alongside another ZKP certifying the correct execution of F. To enable ZkAudit, we develop new methods of computing ZKPs for SGD on modern neural nets for simple recommender systems and image classification models capable of high accuracies on ImageNet. Empirically, we show it is possible to provide trustless audits of DNNs, including copyright, censorship, and counterfactual audits with little to no loss in accuracy.
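The commit-then-audit flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's construction: a salted SHA-256 hash stands in for the commitment scheme, `audit_count_matches` is a hypothetical audit function $F$, and the zero-knowledge proof is omitted entirely. In the actual protocol a ZK-SNARK would certify that the released output equals $F$ applied to the committed data, without the data ever being opened.

```python
import hashlib
import json
import secrets


def commit(obj, salt: bytes) -> str:
    """Salted hash commitment (an assumed stand-in, not the paper's scheme)."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(salt + payload).hexdigest()


def audit_count_matches(dataset, flagged) -> int:
    """Hypothetical audit function F: count training items on a flagged list."""
    return sum(1 for item in dataset if item in flagged)


# Provider side: the dataset stays private, only the commitment is published.
dataset = ["img_001", "img_002", "img_003"]  # illustrative placeholder data
salt = secrets.token_bytes(32)
published_commitment = commit(dataset, salt)

# Audit request: the provider computes F privately and releases the output.
# A real deployment would attach a ZK-SNARK tying `audit_output` to
# `published_commitment`; this sketch does not produce such a proof.
audit_output = audit_count_matches(dataset, {"img_002", "img_999"})

# The commitment is binding: a different dataset yields a different digest.
tampered_commitment = commit(dataset + ["img_004"], salt)
```

The random salt keeps the digest from leaking the data through guessing attacks on small datasets, which is one reason a bare hash is not used here.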

Paper Structure (25 sections, 14 equations, 4 figures, 5 tables)

Introduction
Background on ZK-SNARKs
ZkAudit: Private Audits of ML
Computing ZK-SNARKs for Gradient Descent
Evaluation of ZkAudit-T
Performance of SGD
End-to-End Accuracy and Costs
Effects of Optimizations
Using ZkAudit for Audits
Related Work
Conclusion
Impact Statement
ZK-SNARKs
Intuition
Expressing Functions
...and 10 more sections

Figures (4)

Figure 1: Test accuracy vs cost of proving training across the entire dataset for the Pareto frontier of image classification. Higher is better. The dashed line is the fp32 accuracy.
Figure 2: Test MSE vs total training cost for the Pareto frontier for the recommender system. Lower is better.
Figure 3: Test MSE vs scale factor. ZkAudit-T achieves parity with fp32 at $2^{13}$.
Figure 4: Test accuracy vs scale factor. As shown, we can achieve within 0.7% accuracy compared to full precision with a scale factor of $2^{15}$. The accuracy degrades with lower scale factors.
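To make the scale factor in Figures 3 and 4 concrete, here is a minimal sketch of fixed-point arithmetic with rounded division, assuming a scale factor of $2^{15}$. The helper names (`to_fixed`, `rounded_div`, `fixed_mul`, `sgd_step`) are illustrative and not taken from the paper, and the code runs on plain Python integers rather than inside a ZK-SNARK circuit.

```python
SCALE_BITS = 15
SF = 1 << SCALE_BITS  # scale factor 2^15 (assumed, matching Figure 4)


def to_fixed(x: float) -> int:
    """Quantize a real number to an integer at scale SF."""
    return round(x * SF)


def to_float(q: int) -> float:
    """Map a fixed-point integer back to a real number."""
    return q / SF


def rounded_div(a: int, b: int) -> int:
    """Divide a by b (b > 0), rounding to nearest, with ties rounded up.

    Equivalent to floor(a / b + 1/2), computed with integers only.
    """
    return (2 * a + b) // (2 * b)


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values and rescale back to scale SF."""
    return rounded_div(a * b, SF)


def sgd_step(w: int, grad: int, lr: int) -> int:
    """One scalar SGD update, w - lr * grad, entirely in fixed point."""
    return w - fixed_mul(lr, grad)


# Example: one update with w = 1.0, grad = 0.5, lr = 0.1.
w_next = sgd_step(to_fixed(1.0), to_fixed(0.5), to_fixed(0.1))
approx = to_float(w_next)  # close to the real-valued result 0.95
```

Plain floor division always rounds toward negative infinity, so its error is biased and can accumulate over many updates; rounding to nearest keeps the per-operation error within half a unit of the scale. A smaller scale factor makes each unit coarser, which is consistent with the accuracy degradation at lower scale factors noted in Figure 4.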
