Secure and Confidential Certificates of Online Fairness

Olive Franzese; Ali Shahin Shamsabadi; Carter Luck; Hamed Haddadi

TL;DR

This work tackles verifying ML fairness under confidentiality by introducing OATH, a privacy-preserving online group-fairness certificate. It uses a two-phase protocol with commitments and zero-knowledge proofs, plus a cut-and-choose audit to achieve scalability for large deployment data while protecting client and model confidentiality. Theoretical guarantees show exponential reduction in cheating risk as the audited sample grows, and empirical results across diverse datasets and model types demonstrate practical runtimes and substantial improvements over baselines. The approach enables reliable, scalable, and confidential auditing of fairness in deployed ML services, with clear paths for extending to other online properties and metrics.
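The commit-then-audit flow described above can be illustrated with a minimal sketch. It makes strong simplifying assumptions: it uses plain salted SHA-256 commitments and opens the audited records in the clear, whereas OATH uses zero-knowledge proofs so that queries and the model stay confidential. The record fields, the sample size of 50, and the helper names `commit` and `opens_to` are hypothetical and chosen only for illustration.

```python
import hashlib
import json
import secrets


def commit(record: dict) -> tuple[str, bytes]:
    """Return a salted SHA-256 commitment to one (query, response) record."""
    salt = secrets.token_bytes(16)
    payload = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(salt + payload).hexdigest(), salt


def opens_to(digest: str, record: dict, salt: bytes) -> bool:
    """Check that (record, salt) is a valid opening of the commitment."""
    payload = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(salt + payload).hexdigest() == digest


# Service phase (sketch): the provider commits to every served record.
records = [
    {"query_id": i, "group": i % 2, "decision": (i * 7) % 3 == 0}
    for i in range(1000)
]
openings = [commit(r) for r in records]
published = [digest for digest, _ in openings]  # what the auditor sees

# Audit phase (sketch): the auditor picks a random subset to check,
# in the spirit of cut-and-choose, instead of checking every record.
rng = secrets.SystemRandom()
audited = rng.sample(range(len(records)), k=50)
audit_passed = all(
    opens_to(published[i], records[i], openings[i][1]) for i in audited
)

# A record altered after the fact no longer matches its commitment.
tampered = dict(records[0], decision=not records[0]["decision"])
tamper_detected = not opens_to(published[0], tampered, openings[0][1])
```

Because the provider cannot predict which records will be opened, altering many committed records risks detection even though only a small subset is checked.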

Abstract

The black-box service model enables ML service providers to serve clients while keeping their intellectual property and client data confidential. Confidentiality is critical for delivering ML services legally and responsibly, but makes it difficult for outside parties to verify important model properties such as fairness. Existing methods that assess model fairness confidentially lack either (i) reliability because they certify fairness with respect to a static set of data, and therefore fail to guarantee fairness in the presence of distribution shift or service provider malfeasance; and/or (ii) scalability due to the computational overhead of confidentiality-preserving cryptographic primitives. We address these problems by introducing online fairness certificates, which verify that a model is fair with respect to data received by the service provider online during deployment. We then present OATH, a deployably efficient and scalable zero-knowledge proof protocol for confidential online group fairness certification. OATH exploits statistical properties of group fairness via a cut-and-choose style protocol, enabling scalability improvements over baselines.
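To make the reliability concern concrete, the toy sketch below computes a demographic parity gap on a static audit set and again on deployment data whose distribution has shifted. The threshold model, the scores, and the function names are hypothetical; the gap is taken to be the absolute difference in positive-decision rates between two groups, which is one common way to define demographic parity and is assumed here rather than taken from the paper.

```python
def demographic_parity_gap(groups, decisions):
    """|P(decision=1 | group 0) - P(decision=1 | group 1)| for two groups."""
    rates = []
    for g in (0, 1):
        outcomes = [d for grp, d in zip(groups, decisions) if grp == g]
        rates.append(sum(outcomes) / len(outcomes))
    return abs(rates[0] - rates[1])


def model(score):
    """Hypothetical black-box model: approve when the score reaches 0.5."""
    return int(score >= 0.5)


def gap_on(data):
    """Demographic parity gap of `model` on a list of (group, score) pairs."""
    groups = [g for g, _ in data]
    decisions = [model(s) for _, s in data]
    return demographic_parity_gap(groups, decisions)


# Static audit set: both groups have the same score distribution.
offline = [(0, 0.2), (0, 0.8), (1, 0.3), (1, 0.7)]

# Deployment data: scores for group 1 have drifted below the threshold.
online = [(0, 0.2), (0, 0.8), (0, 0.9), (1, 0.1), (1, 0.3), (1, 0.4)]

offline_gap = gap_on(offline)
online_gap = gap_on(online)
```

In this toy data the same model shows no gap on the static set and a large gap on the shifted deployment data, which is the situation an online certificate is meant to catch.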

Paper Structure (30 sections, 4 theorems, 15 equations, 6 figures, 4 tables, 7 algorithms)

This paper contains 30 sections, 4 theorems, 15 equations, 6 figures, 4 tables, 7 algorithms.

Introduction
Background, Preliminaries & Related Work
Problem Formulation
OATH
Service Phase
Audit Phase
Analysis of Probabilistic Audit
Empirical Evaluation
Efficiency of Certifying Standard Fairness Benchmarks
Parametrization of Probabilistic Audit
Scalability for Neural Networks
Conclusion
Comparison of fairness auditing approaches
Certifying Other Group Fairness Metrics
Details of Cryptographic Preliminaries
...and 15 more sections

Key Result

Theorem 4.1

Let $X_h$ be the honest group fairness gap and $X_m$ the measured group fairness gap computed during Algorithm alg:audit. Consider $\epsilon > 0$, the deviation between these quantities caused by a malicious $\mathcal{P}$ cheating, defined as $\epsilon = \left| X_h - X_m \right|$. Then $\mathcal{P}$ i[…] where $\nu$ is the number of queries uniformly sampled within each group via Algorithm alg:balanced.
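The role of $\nu$ can be explored with a small simulation. The sketch below does not reproduce the theorem's bound or model a cheating $\mathcal{P}$; it only illustrates, under assumed synthetic data, how a gap estimated from $\nu$ uniformly sampled queries per group concentrates around the gap over all queries as $\nu$ grows. The group rates (0.60 and 0.50), population sizes, and function names are hypothetical.

```python
import random


def positive_rate(decisions):
    """Fraction of positive (1) decisions."""
    return sum(decisions) / len(decisions)


def balanced_sample_gap(by_group, nu, rng):
    """Gap estimated from nu decisions sampled uniformly from each group."""
    rates = [
        positive_rate(rng.sample(decisions, nu))
        for decisions in by_group.values()
    ]
    return abs(rates[0] - rates[1])


rng = random.Random(0)

# Synthetic decisions for two groups (assumed positive rates 0.60 and 0.50).
by_group = {
    0: [int(rng.random() < 0.60) for _ in range(20000)],
    1: [int(rng.random() < 0.50) for _ in range(20000)],
}
full_gap = abs(positive_rate(by_group[0]) - positive_rate(by_group[1]))


def mean_abs_error(nu, trials=200):
    """Average |sampled gap - full gap| over repeated balanced samples."""
    total = 0.0
    for _ in range(trials):
        total += abs(balanced_sample_gap(by_group, nu, rng) - full_gap)
    return total / trials


error_small = mean_abs_error(50)    # few verified queries per group
error_large = mean_abs_error(2000)  # many verified queries per group
```

Comparing `error_small` with `error_large` shows the sampled estimate tightening as more queries per group are verified, which is the intuition behind auditing a sample rather than every query.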

Figures (6)

Figure 1: Limitations of Offline Fairness Certificates (left) and Overview of OATH (right). Offline fairness certificates based on offline audit data fail when facing real-world distribution shifts or audit gaming, leading to significant fairness deviation: the Demographic Parity (DP) fairness violation increases from 0 to 1. OATH issues an online fairness certificate of the black-box service provided to clients reliably, efficiently, and securely. During the service phase, OATH authenticates client queries and service provider responses to enable accountability without having to perform client-facing ZKPs. In the audit phase, the service provider and an auditor verify only an asymptotically constant number of client queries while providing provable guarantees on overall group fairness violation.
Figure 2: Ablation study of Audit Phase runtime for $\textsf{OATH}_\text{LR}$: (left) group-balanced uniform sampling and fairness metric computation as the total number of online queries $|Q|$ varies; (right) correctness check and consistency check as a function of the number of verified user queries.
Figure 3: Green: models underneath the example fairness threshold $\theta=0.15$; Red: models placed over the threshold by cheating which escape detection with greater than $1\%$ probability. The possible $\epsilon$ deviation that escapes detection decreases exponentially with the number of verified queries by Theorem 4.1.
Figure 4: Scalability of OATH compared to baseline with varying neural network sizes. Runtimes are estimated using ZKP inference times from weng2021mystique. The baseline approach uses ZKP verified inference with each client (as in yadav2024fairproof; lycklama2024holding) followed by verified group fairness computation over all queries.
Figure 5: The probability that a cheating $\mathcal{P}$ evades detection when: (Left) deviating from the fairness measurement by three fixed $\epsilon$ values at varying numbers of verified queries; (Right) deviating with 7600 verified queries with varying $\epsilon$. As $\epsilon$ gets smaller, the probability of evasion gets higher but it becomes substantially less impactful on the audit.
...and 1 more figure

Theorems & Definitions (8)

Definition 2.1
Definition 4.1
Theorem 4.1
Definition B.1
Theorem G.1
proof
Theorem G.2
Corollary G.1
