Differentially Private Prototypes for Imbalanced Transfer Learning

Dariush Wahdany; Matthew Jagielski; Adam Dziedzic; Franziska Boenisch

TL;DR

The paper tackles private learning under strict privacy budgets and imbalanced data by introducing Differentially Private Prototype Learning (DPPL), which uses features from publicly pre-trained encoders to form per-class DP prototypes. DPPL avoids iterative noise addition by either computing private class means (DPPL-Mean) or privately selecting public prototypes via the Exponential Mechanism (DPPL-Public); by the post-processing property of DP, the resulting prototypes can be released publicly and used for inference at no further privacy cost. Empirical results across four encoders and four vision datasets show DPPL delivering strong utility in high-privacy regimes and on imbalanced data, with DPPL-Public often outperforming state-of-the-art baselines and DPPL-Mean performing well at higher privacy budgets. The work shows that leveraging public data beyond pre-training, together with prototype-based transfer, can substantially improve private learning, particularly for minority classes, and it highlights practical deployment considerations such as encoder choice and public-data size.
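To make the Exponential Mechanism step concrete, here is a minimal sketch of privately selecting one public prototype for a single class. The utility function (a sum of cosine similarities rescaled to $[0, 1]$), the function name `select_public_prototype`, and the add/remove-one-record neighbouring relation are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def select_public_prototype(private_emb, public_emb, epsilon, rng):
    """Pick the index of one public candidate with the exponential mechanism.

    Assumed utility (hypothetical, for illustration): for each public
    candidate, sum over private points of (1 + cosine similarity) / 2.
    Each private point contributes a value in [0, 1], so adding or removing
    one private point changes any utility by at most 1 (sensitivity 1).
    """
    priv = private_emb / np.linalg.norm(private_emb, axis=1, keepdims=True)
    pub = public_emb / np.linalg.norm(public_emb, axis=1, keepdims=True)
    scores = np.clip((1.0 + pub @ priv.T) / 2.0, 0.0, 1.0)  # (n_pub, n_priv)
    utility = scores.sum(axis=1)

    # Standard exponential mechanism: P(i) proportional to exp(eps * u_i / (2 * sensitivity)).
    logits = epsilon * utility / 2.0
    logits -= logits.max()  # shift for numerical stability; probabilities are unchanged
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(pub), p=probs))
```

Because the candidates are public data points, only the choice of index consumes privacy budget; the selected embedding itself contains no added noise.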

Abstract

Machine learning (ML) models have been shown to leak private information from their training datasets. Differential Privacy (DP), typically implemented through the differentially private stochastic gradient descent algorithm (DP-SGD), has become the standard solution to bound leakage from the models. Despite recent improvements, DP-SGD-based approaches for private learning still usually struggle in the high privacy ($\varepsilon \le 1$) and low data regimes, and when the private training datasets are imbalanced. To overcome these limitations, we propose Differentially Private Prototype Learning (DPPL) as a new paradigm for private transfer learning. DPPL leverages publicly pre-trained encoders to extract features from private data and generates DP prototypes that represent each private class in the embedding space and can be publicly released for inference. Since our DP prototypes can be obtained from only a few private training data points and without iterative noise addition, they offer high-utility predictions and strong privacy guarantees even under the notion of \textit{pure DP}. We additionally show that privacy-utility trade-offs can be further improved when leveraging the public data beyond pre-training of the encoder: in particular, we can privately sample our DP prototypes from the publicly available data points used to train the encoder. Our experimental evaluation with four state-of-the-art encoders, four vision datasets, and under different data and imbalancedness regimes demonstrates DPPL's high performance under strong privacy guarantees in challenging private learning setups.
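As a rough illustration of building prototypes with a single noise addition, the sketch below releases noisy per-class sums of unit-norm embeddings using the Gaussian mechanism. This is not necessarily the estimator used for DPPL-Mean: the function name `dp_sum_prototypes`, the zCDP parameter `rho`, the use of sums rather than means, and the add/remove-one-record neighbouring relation are all assumptions made for this example. It also yields a zCDP guarantee rather than pure DP.

```python
import numpy as np

def dp_sum_prototypes(embeddings, labels, num_classes, rho, rng):
    """Noisy per-class sums of unit-norm embeddings (hypothetical sketch).

    Normalising each embedding to unit L2 norm means that adding or removing
    one private point changes the stacked class sums by at most 1 in L2 norm.
    Gaussian noise with std sigma = sqrt(1 / (2 * rho)) then gives rho-zCDP.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sigma = np.sqrt(1.0 / (2.0 * rho))
    prototypes = np.zeros((num_classes, embeddings.shape[1]))
    for c in range(num_classes):
        class_sum = unit[labels == c].sum(axis=0)
        # Noise is added once per class, with no iterative training loop.
        prototypes[c] = class_sum + rng.normal(0.0, sigma, size=class_sum.shape)
    return prototypes
```

If classification only depends on the direction of a prototype (for example, via cosine similarity), the sum does not need to be divided by a class count, which would otherwise have to be estimated privately as well.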

Paper Structure (52 sections, 8 theorems, 20 equations, 29 figures, 4 tables)

Introduction
Background
Related Work
Private Transfer Learning.
Differentially Private Prototyping
DPPL-Mean: Private Means
DPPL-Public: Privately Selecting Public Prototypes
Empirical Evaluation
DP Prototypes: High Utility in High Privacy and Extreme Imbalance
DP Prototypes Improve over State-of-the-Art Baselines in Imbalanced Setups
Understanding the Success of DP Prototypes
Effect of the Publicly Pre-trained Encoder.
Impact of the Projection Layer.
Improving through Multiple Per-Class Prototypes.
Impact of the Public Data for Prototype Selection.
...and 37 more sections

Key Result

Theorem 1

Let $M_i$ each provide $\epsilon$-differential privacy. Let $D_i$ be arbitrary disjoint subsets of the input domain $D$. The sequence of $M_i\left(X \cap D_i\right)$ provides $\epsilon$-differential privacy.
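As a hedged illustration of how this result presumably applies to per-class prototypes: assume every private record belongs to exactly one class, neighbouring datasets differ by adding or removing one record, and each class subset $\mathbf{X}_c$ is processed by its own $\epsilon$-DP mechanism. The symbols $M_c$ and $C$ (the number of classes) are introduced here for illustration only.

```latex
\mathbf{X} = \bigsqcup_{c=1}^{C} \mathbf{X}_c,
\qquad
\mathbf{p}_c = M_c(\mathbf{X}_c), \quad M_c \text{ is } \epsilon\text{-DP for every } c
\;\Longrightarrow\;
(\mathbf{p}_1, \dots, \mathbf{p}_C) \text{ is } \epsilon\text{-DP}.
```

Under these assumptions the total privacy cost does not grow with the number of classes, whereas sequential composition over $C$ mechanisms applied to overlapping data would only guarantee $C\epsilon$.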

Figures (29)

Figure 1: Overview of DPPL. We split the private data $\mathbf{X}$ per class $c$ into $\mathbf{X}_c$'s, infer them through a publicly pre-trained encoder, and estimate per-class prototypes $\mathbf{p}_c$ in the embedding space with DP. Classification of samples is performed by returning the label of the closest prototype $\mathbf{p}_c$ in the embedding space according to some distance function $d$.
Figure 2: DP Prototypes on CIFAR100. We present the balanced test accuracy of our methods vs. standard linear probing with DP-SGD on CIFAR100 and ViT-H-14 at different imbalance ratios (IR), using ImageNet as public data for DPPL-Public. We plot the mean test accuracy over multiple runs and represent the upper and lower quantiles for all methods by the dotted lines.
Figure 3: DP Prototypes on various imbalanced datasets. We present the balanced test accuracy for CIFAR10, CIFAR100, Food101, and STL10 at an imbalance ratio of $100$ on ViT-H-14, using ImageNet as public data for DPPL-Public. We compare to standard linear probing with DP-SGD. We plot the mean test accuracy over multiple runs and represent the upper/lower quantiles by the dotted lines. The appendix shows more results.
Figure 4: Comparing against baselines on CIFAR100. We present the results of our methods vs. state-of-the-art methods (DP-LS and DPSGD-Global-Adapt) on the CIFAR100 dataset using ViT-H-14 under different IRs. DPPL-Public uses ImageNet as public data. Dotted lines represent the upper/lower quantiles. Similar results for CIFAR10, Food101, and STL10 are presented in Appendix E.1.
Figure 5: Accuracies of the minority classes. We depict the test accuracy on CIFAR100 with ViT-H-14 embeddings for the minority classes (smallest $25\%$ of classes) at $\text{IR}=50$.
...and 24 more figures
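The inference rule described in the Figure 1 caption (return the label of the closest prototype) can be sketched as follows. The caption leaves the distance function $d$ open; cosine distance and the function name `classify` are assumptions made here for illustration.

```python
import numpy as np

def classify(queries, prototypes):
    """Return, for each query embedding, the index of the closest prototype.

    Assumes d is cosine distance (1 - cosine similarity); the caption only
    says "some distance function d".
    """
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    cosine_distance = 1.0 - q @ p.T  # shape (n_queries, n_classes)
    return cosine_distance.argmin(axis=1)
```

Since this step only reads the already released prototypes, it is post-processing and spends no additional privacy budget.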

Theorems & Definitions (12)

Definition 1: $(\xi, \rho)$-zCDP from bunConcentratedDifferentialPrivacy2016
Definition 2: $(\alpha, \epsilon)$-RDP from mironovRenyiDifferentialPrivacy2017
Definition 3: $\mu$-GDP from dongGaussianDifferentialPrivacy2019
Theorem 1: Parallel composition from mcsherryPrivacyIntegratedQueries2009
Proposition 2: cesarBoundingConcentratingTruncating2021
Lemma 1: mcsherry2007mechanism
Definition 4
Lemma 2: mcsherry2007mechanism
Lemma 3: cesarBoundingConcentratingTruncating2021
Lemma 4
...and 2 more
