EL-MIA: Quantifying Membership Inference Risks of Sensitive Entities in LLMs
Ali Satvaty, Suzan Verberne, Fatih Turkmen
TL;DR
This work defines EL-MIA, a fine-grained membership inference task focused on sensitive entities in LLMs, and provides a benchmark built on AI4Privacy to evaluate entity-level leakage. It shows that traditional MIA methods struggle to detect entity-level membership, and introduces two targeted attacks, reference-set normalization and suffix scoring, that significantly improve detection across model scales and training dynamics. The study analyzes how factors such as token-length cues, attribute type, and training epochs influence vulnerability, revealing surprising dynamics (e.g., early-model susceptibility and later-epoch amplification). By releasing benchmarks and checkpoints, it enables reproducible assessment and emphasizes the need for standardized PII exposure testing and stronger defenses for real-world LLM deployments.
Abstract
Membership inference attacks (MIA) aim to infer whether a particular data point is part of the training dataset of a model. In this paper, we propose a new task in the context of LLM privacy: entity-level discovery of membership risk focused on sensitive information (PII, credit card numbers, etc.). Existing methods for MIA can detect the presence of entire prompts or documents in the LLM training data, but they fail to capture risks at a finer granularity. We propose the "EL-MIA" framework for auditing entity-level membership risks in LLMs. We construct a benchmark dataset for the evaluation of MIA methods on this task. Using this benchmark, we conduct a systematic comparison of existing MIA techniques as well as two newly proposed methods. We provide a comprehensive analysis of the results, examining how entity-level MIA susceptibility relates to model scale, training epochs, and other surface-level factors. Our findings reveal that existing MIA methods are limited when it comes to entity-level membership inference of sensitive attributes, while this susceptibility can be exposed with relatively straightforward methods, highlighting the need for stronger adversaries to stress-test the provided threat model.
