Differentially Private Selection using Smooth Sensitivity
Iago Chaves, Victor Farias, Amanda Perez, Diego Mesquita, Javam Machado
TL;DR
The paper tackles differentially private selection over discrete outputs by introducing Smooth Noisy Max (SNM), a mechanism that adds noise scaled to a smooth upper bound on local sensitivity, which yields tighter utility than global-sensitivity baselines. SNM is shown to satisfy $(\varepsilon,\delta)$-DP when the noise is drawn from an admissible distribution, and it comes with explicit utility bounds, including the tail bound $\Pr[\xi(\mathscr{A},\mathbf{x})\ge t]\le |\mathscr{R}|\exp(-\varepsilon t/(4\mathscr{S}_{u,\beta}(\mathbf{x})))$, where $\xi(\mathscr{A},\mathbf{x})$ is the error of mechanism $\mathscr{A}$ on dataset $\mathbf{x}$, $\mathscr{R}$ is the candidate set, and $\mathscr{S}_{u,\beta}(\mathbf{x})$ is the smooth upper bound on the local sensitivity of the utility function $u$. The authors apply SNM to three downstream tasks (percentile selection, greedy decision trees, and random forests) and report higher accuracy and lower error across multiple datasets than variants based on the exponential mechanism (EM), permute-and-flip (PF), and local dampening (LD). The work extends smooth sensitivity to discrete private selection, offering practical tools that reduce noise without weakening the privacy guarantee. The authors point to element local sensitivity and multi-objective private selection as directions for future work.
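To make the tail bound concrete, the sketch below evaluates its right-hand side and inverts it to find the error level $t$ at which the bound falls to a chosen failure probability. The function names and all numeric inputs are hypothetical, chosen only for illustration; they do not come from the paper.

```python
import math


def snm_tail_bound(t, num_candidates, epsilon, smooth_sens):
    """Right-hand side |R| * exp(-eps * t / (4 * S)), capped at 1.

    The cap is added here because any probability is trivially at most 1.
    """
    if epsilon <= 0 or smooth_sens <= 0 or num_candidates < 1:
        raise ValueError("epsilon and smooth_sens must be > 0, num_candidates >= 1")
    bound = num_candidates * math.exp(-epsilon * t / (4.0 * smooth_sens))
    return min(1.0, bound)


def error_level_for(gamma, num_candidates, epsilon, smooth_sens):
    """Solve |R| * exp(-eps * t / (4 * S)) = gamma for t.

    Rearranging gives t = (4 * S / eps) * ln(|R| / gamma).
    """
    if not 0 < gamma <= 1:
        raise ValueError("gamma must lie in (0, 1]")
    if epsilon <= 0 or smooth_sens <= 0 or num_candidates < 1:
        raise ValueError("epsilon and smooth_sens must be > 0, num_candidates >= 1")
    return (4.0 * smooth_sens / epsilon) * math.log(num_candidates / gamma)


# Illustrative values only (assumed, not taken from the paper).
t = error_level_for(gamma=0.05, num_candidates=100, epsilon=1.0, smooth_sens=0.5)
print(t)                                   # error level at failure probability 0.05
print(snm_tail_bound(t, 100, 1.0, 0.5))    # recovers roughly 0.05
```

Read this way, the bound says the error level grows linearly in $\mathscr{S}_{u,\beta}(\mathbf{x})$ and only logarithmically in $|\mathscr{R}|$, so a smaller smooth sensitivity directly tightens the guarantee.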
Abstract
Differentially private selection mechanisms offer strong privacy guarantees for queries that aim to identify the top-scoring element $r$ from a finite set $\mathscr{R}$, based on a dataset-dependent utility function. While selection queries are fundamental in data science, few mechanisms effectively ensure their privacy. Furthermore, most approaches rely on global sensitivity to achieve differential privacy (DP), which can introduce excessive noise and impair downstream inferences. To address this limitation, we propose the Smooth Noisy Max (SNM) mechanism, which leverages smooth sensitivity to yield provably tighter upper bounds on expected error than global-sensitivity-based methods. Empirical results demonstrate that SNM is more accurate than state-of-the-art differentially private selection methods in three applications: percentile selection, greedy decision trees, and random forests.
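As a rough illustration of the selection step, the sketch below perturbs each candidate's utility with Laplace noise whose scale is proportional to a caller-supplied smooth sensitivity value and returns the index of the largest noisy score. This is a sketch under stated assumptions, not the authors' implementation: the names `laplace_sample` and `smooth_noisy_max_sketch`, the scale `2 * S / epsilon`, and the toy scores are all hypothetical. Computing a valid smooth upper bound on local sensitivity, and calibrating it to a target $\delta$, is left outside the sketch, so the code on its own carries no privacy guarantee.

```python
import random


def laplace_sample(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def smooth_noisy_max_sketch(scores, smooth_sens, epsilon, rng=None):
    """Return the index of the largest noisy score.

    scores      -- utility of each candidate on the dataset
    smooth_sens -- a smooth upper bound on local sensitivity, assumed to be
                   computed elsewhere (hypothetical input)
    epsilon     -- privacy parameter
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if smooth_sens <= 0 or epsilon <= 0:
        raise ValueError("smooth_sens and epsilon must be positive")
    rng = rng or random.Random()
    # Assumed noise scale, proportional to the smooth sensitivity.
    scale = 2.0 * smooth_sens / epsilon
    noisy = [s + laplace_sample(scale, rng) for s in scores]
    return max(range(len(scores)), key=noisy.__getitem__)


# Toy usage with made-up scores: count how often each candidate is selected.
rng = random.Random(0)
toy_scores = [3.0, 10.0, 9.5, 1.0]
counts = [0] * len(toy_scores)
for _ in range(1000):
    counts[smooth_noisy_max_sketch(toy_scores, smooth_sens=0.1, epsilon=1.0, rng=rng)] += 1
print(counts)
```

When the supplied sensitivity is small relative to the gap between the best and second-best scores, the top candidate is selected in most runs, which is the intuition behind replacing a worst-case global bound with a data-dependent one.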
