Curvature Clues: Decoding Deep Learning Privacy with Input Loss Curvature
Deepak Ravikumar, Efstathia Soufleri, Kaushik Roy
TL;DR
This work investigates input loss curvature, defined as $\mathrm{Curv}_\phi(z_i,S)=\mathrm{tr}(\nabla^2_{z_i} \ell(h^{\phi}_S,z_i))$, as a signal for distinguishing train from test samples and for membership inference in deep networks. It develops a theoretical framework that bounds train-test distinguishability via KL divergence, linking the bound to the privacy parameter $\epsilon$ and the training set size $m$, and shows that curvature-based membership inference attacks (MIAs) can surpass probability-based methods on sufficiently large datasets. To enable privacy testing in black-box settings, the authors introduce a zero-order curvature estimator and combine it with shadow models and likelihood-ratio tests (LR/NLL) to perform MIA without access to model parameters. Empirical validation on CIFAR10, CIFAR100, and ImageNet shows that curvature-based MIAs outperform state-of-the-art techniques and confirms the predicted dependence on dataset size, privacy, and dataset composition. These findings advance the understanding of how input geometry relates to memorization and privacy, and provide practical tools for evaluating privacy-preserving techniques in vision models.
Abstract
In this paper, we explore the properties of loss curvature with respect to input data in deep neural networks. Curvature of loss with respect to input (termed input loss curvature) is the trace of the Hessian of the loss with respect to the input. We investigate how input loss curvature varies between train and test sets, and its implications for train-test distinguishability. We develop a theoretical framework that derives an upper bound on the train-test distinguishability based on privacy and the size of the training set. This novel insight fuels the development of a new black-box membership inference attack (MIA) utilizing input loss curvature. We validate our theoretical findings through experiments in computer vision classification tasks, demonstrating that input loss curvature surpasses existing methods in membership inference effectiveness. Our analysis highlights how the performance of MIA methods varies with the size of the training set, showing that curvature-based MIA outperforms other methods on sufficiently large datasets. This condition is often met by real datasets, as demonstrated by our results on CIFAR10, CIFAR100, and ImageNet. These findings not only advance our understanding of deep neural network behavior but also improve the ability to test privacy-preserving techniques in machine learning.
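To make the definition concrete, the following is a minimal sketch of how the trace of the input Hessian could be estimated using only loss evaluations, which is what a black-box setting permits. It combines Rademacher probe vectors (a Hutchinson-style trace estimate) with a central second difference. This is an illustrative assumption, not necessarily the estimator used by the authors; the function name `zero_order_curvature`, its parameters, and the toy `quadratic_loss` are all hypothetical.

```python
import random
from typing import Callable, List, Optional


def zero_order_curvature(
    loss: Callable[[List[float]], float],
    x: List[float],
    num_probes: int = 10,
    step: float = 1e-2,
    seed: Optional[int] = 0,
) -> float:
    """Estimate tr(Hessian of loss at x) from loss evaluations only.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    base = loss(x)
    total = 0.0
    for _ in range(num_probes):
        # Rademacher probe: each coordinate is +1 or -1 with equal probability.
        v = [rng.choice((-1.0, 1.0)) for _ in x]
        x_plus = [xi + step * vi for xi, vi in zip(x, v)]
        x_minus = [xi - step * vi for xi, vi in zip(x, v)]
        # Central second difference approximates v^T H v, whose expectation
        # over Rademacher probes equals tr(H).
        total += (loss(x_plus) + loss(x_minus) - 2.0 * base) / (step * step)
    return total / num_probes


def quadratic_loss(x: List[float]) -> float:
    # Toy stand-in for a model loss: the Hessian is diag(1, 2, 3), trace 6.
    return 0.5 * (1.0 * x[0] ** 2 + 2.0 * x[1] ** 2 + 3.0 * x[2] ** 2)


estimate = zero_order_curvature(quadratic_loss, [0.3, -1.2, 0.7])
```

On this toy quadratic the estimate matches the true trace up to floating-point error, because the Hessian is diagonal and the second difference is exact for quadratics. For a real network the estimate would be noisy, and the step size and number of probes would need tuning.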
