Secure and Confidential Certificates of Online Fairness
Olive Franzese, Ali Shahin Shamsabadi, Carter Luck, Hamed Haddadi
TL;DR
This work addresses the problem of verifying ML fairness when both the model and client data must remain confidential. It introduces OATH, a protocol that issues privacy-preserving online group-fairness certificates. OATH uses a two-phase protocol built on commitments and zero-knowledge proofs, together with a cut-and-choose audit that lets it scale to large volumes of deployment data while protecting client and model confidentiality. Theoretical guarantees show that the risk of undetected cheating decreases exponentially as the audited sample grows, and empirical results across diverse datasets and model types demonstrate practical runtimes and scalability improvements over baselines. The approach enables reliable, scalable, and confidential fairness auditing of deployed ML services, and may extend to other online properties and metrics.
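To give some intuition for why auditing a larger sample drives cheating risk down exponentially, the sketch below computes the chance that a uniformly random audit misses every tampered query. It is a generic sampling argument, not the bound proved for OATH: the function names, and the assumption that the auditor samples queries uniformly without replacement, are ours for illustration.

```python
from math import comb


def escape_probability(n_total: int, n_bad: int, n_audited: int) -> float:
    """Chance that an audit of n_audited queries, drawn uniformly without
    replacement from n_total, contains none of the n_bad tampered queries.
    (Illustrative model; assumed, not taken from the OATH analysis.)"""
    if n_audited > n_total - n_bad:
        return 0.0  # too few honest queries to fill the audit sample
    return comb(n_total - n_bad, n_audited) / comb(n_total, n_audited)


def exponential_bound(n_total: int, n_bad: int, n_audited: int) -> float:
    """Upper bound (1 - n_bad/n_total)^n_audited on the escape probability,
    which decays exponentially in the audit size."""
    return (1 - n_bad / n_total) ** n_audited


if __name__ == "__main__":
    # Hypothetical deployment: 10,000 queries, 1% of them tampered with.
    for s in (50, 100, 200, 400):
        p = escape_probability(10_000, 100, s)
        b = exponential_bound(10_000, 100, s)
        print(f"audited={s:4d}  escape={p:.4f}  bound={b:.4f}")
```

Under this model, doubling the audit size roughly squares the escape probability, which is the sense in which the risk falls exponentially with the number of audited queries.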
Abstract
The black-box service model enables ML service providers to serve clients while keeping both the providers' intellectual property and the clients' data confidential. Confidentiality is critical for delivering ML services legally and responsibly, but makes it difficult for outside parties to verify important model properties such as fairness. Existing methods that assess model fairness confidentially lack (i) reliability, because they certify fairness with respect to a static set of data and therefore fail to guarantee fairness in the presence of distribution shift or service provider malfeasance; and/or (ii) scalability, due to the computational overhead of confidentiality-preserving cryptographic primitives. We address these problems by introducing online fairness certificates, which verify that a model is fair with respect to data received by the service provider online during deployment. We then present OATH, a deployably efficient and scalable zero-knowledge proof protocol for confidential online group fairness certification. OATH exploits statistical properties of group fairness via a cut-and-choose style protocol, enabling scalability improvements over baselines.
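As a concrete picture of what a group fairness measurement over online data might look like, the following minimal sketch computes a demographic parity gap from a log of deployment queries. The choice of demographic parity as the metric, the record layout, and the function name are assumptions made here for illustration; the computation is shown in the clear, whereas OATH would establish such a property under commitments and zero-knowledge proofs.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Tuple


def demographic_parity_gap(records: Iterable[Tuple[Hashable, int]]) -> float:
    """Largest difference in positive-prediction rate between any two groups.

    records: (group, prediction) pairs with prediction in {0, 1}, e.g. the
    queries answered by a deployed model (hypothetical log format).
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, prediction in records:
        counts[group][0] += prediction
        counts[group][1] += 1
    if not counts:
        raise ValueError("no records to evaluate")
    rates = [positives / total for positives, total in counts.values()]
    return max(rates) - min(rates)


if __name__ == "__main__":
    # Toy log: group "a" receives 1 of 2 positives, group "b" receives 2 of 2.
    log = [("a", 1), ("a", 0), ("b", 1), ("b", 1)]
    print(demographic_parity_gap(log))
```

Because the gap is a difference of per-group averages, it can be estimated from a random sample of queries, which is the kind of statistical structure a cut-and-choose audit can take advantage of.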
