LeakBoost: Perceptual-Loss-Based Membership Inference Attack
Amit Kravchik Taub, Fred M. Grabovski, Guy Amit, Yisroel Mirsky
TL;DR
LeakBoost strengthens membership inference attacks (MIAs) with an active interrogation mechanism: it optimizes a perceptual loss over a model's internal activations to generate interrogation images. These boosted samples are fed into existing detectors (notably GLiR), yielding substantial gains in true-positive rate at low false-positive rates (FPR) on CIFAR-10 and CIFAR-100 and on architectures including ViT-4 and AlexNet. The best results come from short, low-learning-rate optimizations and from deeper representations. The method is detector-agnostic and lightweight. It reframes MIA as a dynamic, interrogation-based problem and provides a practical tool for privacy assessment and defense evaluation. By linking representational geometry to memorization, LeakBoost offers insight into privacy risks in white-box settings and suggests directions for cross-modal extensions and defense research.
Abstract
Membership inference attacks (MIAs) aim to determine whether a sample was part of a model's training set, posing serious privacy risks for modern machine-learning systems. Existing MIAs rely primarily on static indicators, such as loss or confidence, and do not fully exploit how a model behaves when it is actively probed. We propose LeakBoost, a perceptual-loss-based interrogation framework that actively probes a model's internal representations to expose hidden membership signals. Given a candidate input, LeakBoost synthesizes an interrogation image by optimizing a perceptual (activation-space) objective, amplifying representational differences between members and non-members. This image is then analyzed by an off-the-shelf membership detector, without modifying the detector itself. When combined with existing membership inference methods, LeakBoost achieves substantial improvements at low false-positive rates (FPR) across multiple image classification datasets and diverse neural network architectures. In particular, it raises the AUC from near-chance levels (0.53 to 0.62) to 0.81 to 0.88, and increases the true-positive rate (TPR) at 1% FPR by over an order of magnitude compared to strong baseline attacks. A detailed sensitivity analysis reveals that deeper layers and short, low-learning-rate optimization produce the strongest leakage, and that improvements concentrate in gradient-based detectors. LeakBoost thus offers a modular and computationally efficient way to assess privacy risks in white-box settings, advancing the study of dynamic membership inference.
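
The interrogation step can be illustrated with a toy sketch. The abstract does not specify the exact objective, the initialization, or the layers used, so everything below is an assumption made for illustration: a single tanh layer (`activations`) stands in for a real network's internal representation, the probe starts from small random noise, and the perceptual loss is taken to be the squared distance between the probe's activations and the candidate's. The names `activations`, `perceptual_loss`, and `interrogate` are hypothetical and are not taken from the paper.

```python
import math
import random


def activations(weights, x):
    """Toy internal layer: tanh of a linear map (stand-in for a real network layer)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def perceptual_loss(weights, probe, target_act):
    """Squared activation-space distance between the probe and the candidate."""
    act = activations(weights, probe)
    return sum((a - t) ** 2 for a, t in zip(act, target_act))


def interrogate(weights, candidate, steps=20, lr=0.02, seed=0):
    """Short, low-learning-rate gradient descent on the perceptual loss.

    Assumption: the probe is initialized with small random noise.
    Returns the optimized probe and the loss after every step.
    """
    rng = random.Random(seed)
    target_act = activations(weights, candidate)
    probe = [rng.uniform(-0.1, 0.1) for _ in candidate]
    losses = [perceptual_loss(weights, probe, target_act)]
    for _ in range(steps):
        act = activations(weights, probe)
        # Gradient w.r.t. pre-activations: 2 * (a - t) * tanh'(z), with tanh'(z) = 1 - a^2.
        dz = [2.0 * (a - t) * (1.0 - a * a) for a, t in zip(act, target_act)]
        # Chain rule back to the input: grad_i = sum_j W[j][i] * dz[j].
        grad = [
            sum(weights[j][i] * dz[j] for j in range(len(weights)))
            for i in range(len(probe))
        ]
        probe = [p - lr * g for p, g in zip(probe, grad)]
        losses.append(perceptual_loss(weights, probe, target_act))
    return probe, losses


# Toy usage: a fixed random 3x4 layer and a 4-dimensional "image".
weight_rng = random.Random(1)
W = [[weight_rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(3)]
candidate = [0.5, -0.2, 0.8, 0.1]
probe, losses = interrogate(W, candidate)
```

In a real white-box setting the probe would be an image tensor, the activations would come from one or more layers of the target model, and the resulting interrogation image (or quantities derived from it) would be passed to an unmodified membership detector.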

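The low-FPR metric quoted in the abstract (TPR at 1% FPR) can be computed from raw detector scores as sketched below. This is a minimal illustration that assumes higher scores mean "more likely a member" and that the threshold is chosen among the observed scores; the function name `tpr_at_fpr` and the toy scores are hypothetical and are not results from the paper.

```python
def tpr_at_fpr(member_scores, nonmember_scores, max_fpr=0.01):
    """Highest TPR reachable by a score threshold whose FPR stays at or below max_fpr.

    A sample is predicted "member" when its score is >= the threshold.
    """
    best_tpr = 0.0
    # Candidate thresholds: every distinct observed score.
    for threshold in sorted(set(member_scores) | set(nonmember_scores)):
        false_pos = sum(s >= threshold for s in nonmember_scores)
        fpr = false_pos / len(nonmember_scores)
        if fpr <= max_fpr:
            true_pos = sum(s >= threshold for s in member_scores)
            best_tpr = max(best_tpr, true_pos / len(member_scores))
    return best_tpr


# Toy integer scores: members are shifted upward relative to non-members.
nonmembers = list(range(100))      # scores 0..99
members = list(range(50, 150))     # scores 50..149
result = tpr_at_fpr(members, nonmembers, max_fpr=0.01)
```

Reporting TPR at a fixed small FPR, rather than AUC alone, shows how many members an attacker can identify while making very few false accusations.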