MINT-Demo: Membership Inference Test Demonstrator

Daniel DeAlcala, Aythami Morales, Julian Fierrez, Gonzalo Mancera, Ruben Tolosana, Ruben Vera-Rodriguez

TL;DR

The paper addresses the need for transparency about which data are used to train AI models, in the context of evolving governance frameworks, by introducing the Membership Inference Test (MINT): a method that trains an auditing model on Auxiliary Auditable Data (AAD) to perform membership inference. The problem is formalized with data sets $D$ and $E$, a model $M$ producing $y = M(d|w)$, and an auditing model $T(\theta)$ that uses $AAD = N(d|w')$ to decide whether a given sample $d$ was used for training. Two architectures are presented: Vanilla MINT (a 3-layer MLP) and CNN MINT (convolutional layers with FC). Empirical results show up to $89\%$ accuracy on five public datasets totaling over $22{,}000{,}000$ images, with CNN MINT outperforming Vanilla MINT and the choice of activation layer significantly affecting performance. A web-based demonstrator further promotes transparency by providing actionable reports to stakeholders, highlighting the practical potential of the approach for regulatory compliance and the protection of citizens' rights in AI deployment.
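To make the role of the auditing model $T(\theta)$ more concrete, the following is a minimal, untrained sketch of a 3-layer MLP that maps a flattened AAD vector to a membership score in $(0, 1)$. It is an illustration only, not the authors' implementation. The class name `VanillaMintSketch`, the helper functions, the AAD dimension (32), the hidden widths (16 and 8), the ReLU and sigmoid activations, and the random initialization are all assumptions made here for illustration. In practice the weights would be fit on labeled member and non-member samples.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Affine map: one output per weight row."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class VanillaMintSketch:
    """Hypothetical 3-layer MLP standing in for the auditing model T(theta)."""

    def __init__(self, aad_dim, hidden=(16, 8), seed=0):
        rng = random.Random(seed)
        # Layer sizes are assumptions: aad_dim -> 16 -> 8 -> 1.
        sizes = [aad_dim, *hidden, 1]
        self.layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

    def predict_proba(self, aad):
        """Score in (0, 1); higher would mean 'more likely used in training'."""
        h = list(aad)
        for layer in self.layers[:-1]:
            h = relu(dense(h, layer))
        (logit,) = dense(h, self.layers[-1])
        return sigmoid(logit)


# Stand-in for AAD = N(d|w'): in a real audit this vector would come from
# the audited model M (for example, a flattened activation map).
aad = [0.1 * i for i in range(32)]
t = VanillaMintSketch(aad_dim=32)
p = t.predict_proba(aad)
print(f"membership score: {p:.3f}")
```

Because the weights are random, the printed score carries no meaning; the sketch only shows the shape of the computation, from AAD vector to a single membership score.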

Abstract

We present the Membership Inference Test (MINT) Demonstrator to emphasize the need for more transparent machine learning training processes. MINT is a technique for experimentally determining whether certain data has been used during the training of machine learning models. We conduct experiments with popular face recognition models and 5 public databases containing over 22M images. Promising results, with up to 89% accuracy, are achieved, suggesting that it is possible to recognize whether an AI model has been trained with specific data. Finally, we present a MINT platform as a demonstrator of this technology, aimed at promoting transparency in AI training.

Paper Structure

This paper contains 5 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Block diagram of the Membership Inference Test.
  • Figure 2: The MINT Model ($T$) predicts whether specific data ($d$) was used to train an Audited AI Model ($M$), using Auxiliary Auditable Data (e.g., activation maps) and/or the model outcome from $M$.