Private Selection with Heterogeneous Sensitivities

Daniela Antonova, Allegra Laro, Audra McMillan, Lorenz Wolf

TL;DR

The paper addresses private selection when candidate sensitivities are heterogeneous, showing that the correlation between scores and sensitivities can guide the choice of DP mechanism. It analyzes GEM, mGEM, and $\text{RS}_{\gamma}$, demonstrating that no single method uniformly dominates RNM across all settings, but that a correlation-driven adaptive approach often yields gains. The authors propose Combined GEM, which adaptively switches between GEM and mGEM based on a privately estimated correlation, and validate the ideas with synthetic experiments and real-world data using SANSA-predicted scores. They also show that private selection can confer advantages in online bandits under distribution shift, highlighting practical impact for DP-enabled model selection, recommendation, and sequential decision tasks.

Abstract

Differentially private (DP) selection involves choosing a high-scoring candidate from a finite candidate pool, where each score depends on a sensitive dataset. This problem arises naturally in a variety of contexts including model selection, hypothesis testing, and within many DP algorithms. Classical methods, such as Report Noisy Max (RNM), assume all candidates' scores are equally sensitive to changes in a single individual's data, but this often isn't the case. To address this, algorithms like the Generalised Exponential Mechanism (GEM) leverage variability in candidate sensitivities. However, we observe that while these algorithms can outperform RNM in some situations, they may underperform in others, and can even perform worse than random selection. In this work, we explore how the distribution of scores and sensitivities impacts DP selection mechanisms. In all settings we study, we find that there exists a mechanism that utilises heterogeneity in the candidate sensitivities and outperforms standard mechanisms like RNM. However, no single mechanism uniformly outperforms RNM. We propose using the correlation between the scores and sensitivities as the basis for deciding which DP selection mechanism to use. Further, we design a slight variant of GEM, modified GEM, that generally performs well whenever GEM performs poorly. Relying on the correlation heuristic, we propose combined GEM, which adaptively chooses between GEM and modified GEM and outperforms both in polarised settings.
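
To make the baseline concrete, here is a minimal sketch of Report Noisy Max in Python. It is an illustration under stated assumptions, not the authors' implementation. The function names `laplace_noise` and `report_noisy_max` are hypothetical, and the Laplace scale `2 * delta_max / epsilon` is a conservative textbook choice assumed here rather than taken from the paper. The point it shows is that a uniform-noise mechanism must calibrate to the largest sensitivity in the pool, even when most candidates are far less sensitive.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def report_noisy_max(scores, sensitivities, epsilon, rng=None):
    """Return the index of the candidate with the largest noisy score.

    Every candidate receives noise calibrated to the worst-case
    sensitivity, so heterogeneity in `sensitivities` is ignored.
    """
    rng = rng or random.Random()
    delta_max = max(sensitivities)
    # Assumption: conservative noise scale for general (non-monotone) scores.
    scale = 2.0 * delta_max / epsilon
    noisy = [q + laplace_noise(scale, rng) for q in scores]
    return max(range(len(scores)), key=lambda i: noisy[i])


if __name__ == "__main__":
    scores = [0.0, 1.0, 0.5]
    sensitivities = [0.1, 1.0, 0.1]
    print(report_noisy_max(scores, sensitivities, epsilon=0.5))
```

With a small `epsilon`, the single candidate of sensitivity 1.0 forces large noise onto the two low-sensitivity candidates as well. Sensitivity-aware mechanisms such as GEM aim to avoid that cost. The sketch uses ordinary floating-point sampling and is meant only for illustration.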

Paper Structure (43 sections, 27 equations, 18 figures, 2 algorithms)

Figures (18)

  • Figure 1: Analysis of how the distribution of scores and candidate-wise sensitivities affects the relative performance of selection algorithms, in three simple scenarios. The figures in the top row show each candidate's score (dark purple dot) and sensitivity (light purple vertical line). The figures in the second row show the performance (in mean squared error relative to the best candidate) of different private selection algorithms as a function of the privacy parameter $\epsilon$.
  • Figure 2: Comparing the performance of $\text{RNM}$, RMNH and $\text{RS}_{\gamma}$ for varying sensitivities $\Delta_1$ (on the horizontal axis) and $\Delta_2$ (on the vertical axis). Here $\text{RS}_{\gamma}$ is run with $\gamma=0.01$, $\epsilon=0.1$ and scores $q_1=0, q_2=1$.
  • Figure 3: Analysing the impact of correlation between scores and sensitivities on the behavior of GEM and mGEM. The centre black line is $q_a$, the top (pink) line is the function used to reorder candidates for mGEM, and the bottom purple line is the function used to reorder candidates for GEM.
  • Figure 4: The top-row panels for Scenario 1 and Scenario 2 show the sensitivities plotted against scores (Scenario 1 has $18$ candidates overlaid with score $1$ and sensitivity $0.1$). The performance panels for the two scenarios show mechanism performance in terms of MSE for different values of the privacy parameter $\epsilon$. In the remaining GEM and mGEM panels, candidates are weighted by how likely the specified mechanism is to choose that candidate (with $\epsilon=0.05$).
  • Figure 5: Analysis of how the distribution of scores and candidate-wise sensitivities affects the relative performance of algorithms in three slightly more realistic scenarios, with some randomness in the scores. The figures in the top row show the distributions according to which scores and sensitivities are obtained. Each mean score is shown as a red dot and each sensitivity as a blue vertical line. The figures in the second row show mechanism performance in terms of MSE for different values of the privacy parameter $\epsilon$.
  • ...and 13 more figures

Theorems & Definitions (2)

  • Definition 3.1: Differential Privacy [Dwork_2006Foundations]
  • Definition 1: Truncated Negative Binomial Distribution