ML Privacy Meter: Aiding Regulatory Compliance by Quantifying the Privacy Risks of Machine Learning
Sasi Kumar Murakonda, Reza Shokri
TL;DR
The paper addresses the risk that machine learning models leak training data through their predictions and parameters, which is not fully captured by traditional security considerations. It introduces ML Privacy Meter, a Python library that quantifies privacy risk using membership inference attacks under black-box and white-box access, producing per-record risk scores and ROC-AUC leakage measures. The tool generates privacy reports, supports DPIA workflows, and helps compare risk across data classes and access modes, enabling practical mitigation decisions. It also discusses integration with differential privacy and parameter tuning to balance privacy guarantees with utility, referencing OpenDP and TensorFlow Privacy.
Abstract
When building machine learning models using sensitive data, organizations should ensure that the data processed in such systems is adequately protected. For projects involving machine learning on personal data, Article 35 of the GDPR requires the organization to perform a Data Protection Impact Assessment (DPIA). In addition to the threat of illegitimate access to data through security breaches, machine learning models pose a further privacy risk by indirectly revealing information about the data through model predictions and parameters. Guidance released by the Information Commissioner's Office (UK) and the National Institute of Standards and Technology (US) emphasizes the threat to data from models and recommends that organizations account for and estimate these risks to comply with data protection regulations. Hence, there is an immediate need for a tool that can quantify the privacy risk to data from models. In this paper, we focus on this indirect leakage about training data from machine learning models. We present ML Privacy Meter, a tool that quantifies the privacy risk to data from models through state-of-the-art membership inference attack techniques. We discuss how this tool can help practitioners comply with data protection regulations when deploying machine learning models.
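To make the idea of quantifying leakage with a membership inference attack concrete, the following is a minimal sketch and not ML Privacy Meter's API. It assumes a simple loss-based attack (records with lower loss are guessed to be training members), uses synthetic per-record losses in place of a real model, and introduces the hypothetical helper `membership_auc` to compute the attack's ROC-AUC. An AUC near 0.5 indicates that the attack cannot tell members from non-members, while values closer to 1.0 indicate higher leakage.

```python
import random


def membership_auc(member_scores, nonmember_scores):
    """ROC-AUC of an attack that ranks records by score (higher = more likely a member).

    Computed as the probability that a randomly chosen member receives a
    higher score than a randomly chosen non-member, counting ties as 0.5.
    """
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Synthetic per-record losses (an assumption for illustration only):
# an overfitted model tends to have lower loss on its training members.
rng = random.Random(0)
member_losses = [rng.gauss(0.2, 0.1) for _ in range(500)]
nonmember_losses = [rng.gauss(0.5, 0.2) for _ in range(500)]

# Attack score: negative loss, so that lower loss means "more likely a member".
member_scores = [-loss for loss in member_losses]
nonmember_scores = [-loss for loss in nonmember_losses]

auc = membership_auc(member_scores, nonmember_scores)
print(f"membership inference AUC: {auc:.3f}")
```

In this sketch the attack score of an individual record can be read as a rough per-record risk indicator, and the AUC as an aggregate leakage measure. The attacks referred to in the paper are more sophisticated than this threshold-style baseline.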
