Table of Contents
Fetching ...

Noisy Neighbors: Efficient membership inference attacks against LLMs

Filippo Galli, Luca Melis, Tommaso Cucinotta

TL;DR

This paper introduces an efficient methodology that generates noisy neighbors for a target sample by adding stochastic noise in the embedding space, requiring operating the target model in inference mode only, showing its usability in practical privacy auditing scenarios.

Abstract

The potential of transformer-based LLMs risks being hindered by privacy concerns due to their reliance on extensive datasets, possibly including sensitive information. Regulatory measures like GDPR and CCPA call for using robust auditing tools to address potential privacy issues, with Membership Inference Attacks (MIA) being the primary method for assessing LLMs' privacy risks. Differently from traditional MIA approaches, often requiring computationally intensive training of additional models, this paper introduces an efficient methodology that generates \textit{noisy neighbors} for a target sample by adding stochastic noise in the embedding space, requiring operating the target model in inference mode only. Our findings demonstrate that this approach closely matches the effectiveness of employing shadow models, showing its usability in practical privacy auditing scenarios.

Noisy Neighbors: Efficient membership inference attacks against LLMs

TL;DR

This paper introduces an efficient methodology that generates noisy neighbors for a target sample by adding stochastic noise in the embedding space, requiring operating the target model in inference mode only, showing its usability in practical privacy auditing scenarios.

Abstract

The potential of transformer-based LLMs risks being hindered by privacy concerns due to their reliance on extensive datasets, possibly including sensitive information. Regulatory measures like GDPR and CCPA call for using robust auditing tools to address potential privacy issues, with Membership Inference Attacks (MIA) being the primary method for assessing LLMs' privacy risks. Differently from traditional MIA approaches, often requiring computationally intensive training of additional models, this paper introduces an efficient methodology that generates \textit{noisy neighbors} for a target sample by adding stochastic noise in the embedding space, requiring operating the target model in inference mode only. Our findings demonstrate that this approach closely matches the effectiveness of employing shadow models, showing its usability in practical privacy auditing scenarios.

Paper Structure

This paper contains 7 sections, 7 equations, 3 figures.

Figures (3)

  • Figure 1: The AUC of the thresholding classifier for MIA shows a single and prominent peak at the optimal $\sigma$ value in the noisy neighbors strategy.
  • Figure 2: Efficacy of different strategies for MIA. Confidence intervals computed with the Clopper-Person exact method.
  • Figure 3: Empirical differential privacy measured downstream of training.