MINT-Demo: Membership Inference Test Demonstrator
Daniel DeAlcala, Aythami Morales, Julian Fierrez, Gonzalo Mancera, Ruben Tolosana, Ruben Vera-Rodriguez
TL;DR
The paper addresses the need for transparency about the data used to train AI models under evolving governance frameworks. It introduces MINT, a method that trains an auditing model on Auxiliary Auditable Data (AAD) to perform membership inference. The problem is formalized with datasets $D$ and $E$, a model $M$ producing $y = M(d|w)$, and an auditing model $T(\theta)$ that uses $AAD = N(d|w')$ to decide whether a sample $d$ was used for training. Two auditing architectures are presented: Vanilla MINT (a 3-layer MLP) and CNN MINT (convolutional layers with FC). Empirical results show up to $89\%$ accuracy on five public datasets totaling over $22{,}000{,}000$ images, with CNN MINT outperforming Vanilla MINT and the choice of activation layer significantly affecting performance. A web-based demonstrator complements the method by providing actionable reports to stakeholders, highlighting its practical potential for regulatory compliance and the protection of citizen rights in AI deployment.
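To make the roles of $AAD$ and $T(\theta)$ concrete, the following is a minimal sketch of the inference path of a Vanilla-MINT-style auditor: a 3-layer MLP that maps an activation vector to a membership score. It is not the authors' implementation. The hidden sizes, the 0.5 threshold, the random (untrained) weights, and all function names are assumptions made for illustration; in practice $\theta$ would be learned from AAD labeled as coming from $D$ or $E$.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Small random weights (placeholder for a trained theta)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def make_auditor(aad_dim, hidden=(16, 8), seed=0):
    """Build T as three dense layers: aad_dim -> 16 -> 8 -> 1 (sizes assumed)."""
    rng = random.Random(seed)
    sizes = [aad_dim, *hidden, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def membership_score(auditor, aad):
    """Forward pass of T: AAD vector -> score in (0, 1) that d was in D."""
    h = aad
    for weights, bias in auditor[:-1]:
        h = relu(dense(h, weights, bias))
    weights, bias = auditor[-1]
    logit = dense(h, weights, bias)[0]
    return 1.0 / (1.0 + math.exp(-logit))


def is_member(auditor, aad, threshold=0.5):
    """Binary membership decision; the threshold is an assumed default."""
    return membership_score(auditor, aad) >= threshold


# Hypothetical activations N(d|w') extracted from the audited model for one sample d.
aad_example = [0.3, -1.2, 0.8, 0.05]
auditor = make_auditor(aad_dim=len(aad_example))
score = membership_score(auditor, aad_example)
decision = is_member(auditor, aad_example)
```

Because the weights here are untrained, the score carries no information about actual membership; the sketch only shows how an activation vector flows through the auditor to a decision.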
Abstract
We present the Membership Inference Test (MINT) Demonstrator to emphasize the need for more transparent machine learning training processes. MINT is a technique for experimentally determining whether certain data has been used during the training of machine learning models. We conduct experiments with popular face recognition models and 5 public databases containing over 22M images. Promising results of up to 89% accuracy are achieved, suggesting that it is possible to recognize whether an AI model has been trained with specific data. Finally, we present a MINT platform as a demonstrator of this technology, aimed at promoting transparency in AI training.
