Holding Secrets Accountable: Auditing Privacy-Preserving Machine Learning

Hidde Lycklama, Alexander Viand, Nicolas Küchler, Christian Knabenhans, Anwar Hithnawi

TL;DR

Arc tackles auditing in privacy-preserving ML by introducing a cryptographic framework that binds training data, models, and predictions through concise receipts, enabling post-hoc audits without exposing private inputs. It defines an ideal functionality $\,\mathcal{F}_{Arc}$ and installs a Proof-of-Consistency (PoC) based on KZG polynomial commitments to verify input consistency under MPC, with identifiable aborts to ensure trust. Empirically, Arc outperforms hashing-based and homomorphic-commitment baselines by up to $10^4\times$ in runtime and up to $10^6\times$ in receipt conciseness across diverse models (from Logistic Regression to BERT) and datasets, while maintaining modest storage overhead. The auditing functions span robustness & fairness, accountability, and explainability, enabling scalable, end-to-end private audits in mixed plaintext/secure deployments and are released as open-source components for practical PPML deployments.

Abstract

Recent advancements in privacy-preserving machine learning are paving the way to extend the benefits of ML to highly sensitive data that, until now, have been hard to utilize due to privacy concerns and regulatory constraints. Simultaneously, there is a growing emphasis on enhancing the transparency and accountability of machine learning, including the ability to audit ML deployments. While ML auditing and PPML have both been the subjects of intensive research, they have predominantly been examined in isolation. However, their combination is becoming increasingly important. In this work, we introduce Arc, an MPC framework for auditing privacy-preserving machine learning. At the core of our framework is a new protocol for efficiently verifying MPC inputs against succinct commitments at scale. We evaluate the performance of our framework when instantiated with our consistency protocol and compare it to hashing-based and homomorphic-commitment-based approaches, demonstrating that it is up to $10^4\times$ faster and up to $10^6\times$ more concise.
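To make the idea of checking inputs against a polynomial-based commitment more concrete, the following is a minimal plaintext sketch of the random-evaluation check that polynomial consistency arguments commonly build on. It is not Arc's Proof-of-Consistency protocol: it has no hiding, no pairings, no KZG setup, and no MPC. The modulus `P` and the helpers `evaluate` and `consistent` are hypothetical names chosen for illustration, and inputs are assumed to already be field elements in the range `[0, P)`.

```python
import secrets

# Toy prime modulus (2**61 - 1 is a Mersenne prime); an assumption for illustration only.
P = 2**61 - 1


def evaluate(coeffs, x, p=P):
    """Evaluate the polynomial whose coefficients are `coeffs` at x (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def consistent(committed, supplied, r=None, p=P):
    """Compare two input vectors by evaluating both as polynomials at one point.

    If the vectors differ, the evaluations can only collide when r is a root of
    the difference polynomial, which has at most len(committed) - 1 roots.
    """
    if len(committed) != len(supplied):
        return False
    if r is None:
        # Hypothetical choice: sample the challenge point uniformly at random.
        r = secrets.randbelow(p)
    return evaluate(committed, r, p) == evaluate(supplied, r, p)
```

With a randomly sampled point, two distinct vectors of length n pass this check with probability at most (n - 1)/P. A real polynomial commitment scheme would replace the plaintext evaluation with a commitment and an opening proof, so the verifier never sees the inputs.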

Paper Structure (32 sections, 2 theorems, 8 equations, 7 figures, 4 tables, 6 algorithms)

Key Result

Theorem B.1

Given a set of $N_{\texttt{DH}}$ $\texttt{DH}$, a set of $N_{\texttt{M}}$ $\texttt{M}$, a set of $N_{\texttt{C}}$ $\texttt{C}$, a set of $N_{\texttt{TC}}$ $\texttt{TC}$, a set of $N_{\texttt{IC}}$ are computationally indistinguishable, where $\mathbf{D}$ is a list of training datasets for

Figures (7)

  • Figure 1: Overview of Arc, which augments existing PPML pipelines with an MPC auditing phase to execute auditing functions. receive a receipt that can later be used to verify the consistency of the training data, model and prediction under audit.
  • Figure 2: Arc's Ideal Functionality.
  • Figure 3: Evaluation of Arc comparing the approaches relative to a single epoch of training.
  • Figure 4: The overhead of our system's consistency protocol relative to a single inference for our three scenarios.
  • Figure 5: The overhead of Arc's consistency layer relative to the cost of the auditing function computation in for four different auditing functions across our three scenarios.
  • ...and 2 more figures

Theorems & Definitions (12)

  • Definition 4.1: PoC Protocol
  • Definition A.1: Commitment Scheme
  • Definition A.2: Polynomial Commitments [Kate2010-px]
  • Definition A.3: KZG Commitments [Kate2010-px]
  • Definition A.4: Homomorphic Commitment Scheme [Bunz2018-mg]
  • Definition A.5: Digital Signature Scheme
  • Definition A.6: Proof-of-Training
  • Definition A.7: Proof-of-Inference
  • Theorem B.1: Collaborative Auditing
  • proof
  • ...and 2 more
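
As a rough illustration of what a commitment scheme (Definition A.1) provides, the sketch below shows a simple hash-based commit-and-open construction, in the spirit of the hashing-based baseline mentioned in the abstract. It is an assumption-laden toy, not the paper's definition or implementation: the function names `commit` and `verify`, the 32-byte nonce, and the use of SHA-256 are all illustrative choices.

```python
import hashlib
import hmac
import secrets


def commit(message: bytes):
    """Return (commitment, nonce) for `message`.

    The random nonce hides the message; the hash binds the committer to it.
    """
    nonce = secrets.token_bytes(32)  # hypothetical nonce length
    commitment = hashlib.sha256(nonce + message).digest()
    return commitment, nonce


def verify(commitment: bytes, message: bytes, nonce: bytes) -> bool:
    """Check that (message, nonce) opens `commitment`."""
    expected = hashlib.sha256(nonce + message).digest()
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(commitment, expected)
```

A party would publish the commitment up front and reveal the message and nonce only when an opening is required. Unlike polynomial or homomorphic commitments, this construction offers no algebraic structure to exploit, so checking it inside a secure computation means evaluating the hash function there.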