Smooth Sensitivity for Learning Differentially-Private yet Accurate Rule Lists

Timothée Ly, Julien Ferry, Marie-José Huguet, Sébastien Gambs, Ulrich Aivodji

TL;DR

This paper establishes the smooth sensitivity of the Gini impurity, which can be used to obtain thorough DP guarantees while adding noise scaled with tighter magnitude, and illustrates the applicability of this mechanism by integrating it within a greedy algorithm producing rule list models.

Abstract

Differentially-private (DP) mechanisms can be embedded into the design of a machine learning algorithm to protect the resulting model against privacy leakage. However, this often comes with a significant loss of accuracy due to the noise added to enforce DP. In this paper, we aim at improving this trade-off for a popular class of machine learning algorithms that leverage the Gini impurity as an information gain criterion to greedily build interpretable models such as decision trees or rule lists. To this end, we establish the smooth sensitivity of the Gini impurity, which can be used to obtain thorough DP guarantees while adding noise scaled with tighter magnitude. We illustrate the applicability of this mechanism by integrating it within a greedy algorithm producing rule list models, motivated by the fact that such models remain understudied in the DP literature. Our theoretical analysis and experimental results confirm that the DP rule list models integrating smooth sensitivity have higher accuracy than those using other DP frameworks based on global sensitivity, for identical privacy budgets.
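As a rough illustration of the setting the abstract describes, the sketch below computes a Gini impurity and perturbs it with Laplace noise scaled to a sensitivity value. It assumes the textbook definition of Gini impurity (one minus the sum of squared class proportions), which may differ from the exact form used in the paper. The names `gini_impurity` and `laplace_mechanism` are hypothetical, and the sensitivity is a caller-supplied placeholder: the paper's smooth-sensitivity bound is not reproduced here.

```python
import random
from collections import Counter
from typing import Hashable, Optional, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """Textbook Gini impurity: 1 - sum_c p_c^2 (assumed form, see lead-in)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def laplace_mechanism(
    value: float,
    sensitivity: float,
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Add Laplace noise with scale sensitivity / epsilon to `value`.

    `sensitivity` is a placeholder supplied by the caller; deriving a valid
    bound for the statistic being released is outside this sketch.
    """
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("sensitivity must be >= 0 and epsilon must be > 0")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return value + noise


labels = [0, 0, 1, 1]
impurity = gini_impurity(labels)
noisy = laplace_mechanism(impurity, sensitivity=0.5, epsilon=1.0,
                          rng=random.Random(0))
```

The smaller the sensitivity passed in, the smaller the noise scale for a given `epsilon`, which is why a tighter sensitivity bound can translate into more accurate models.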

Paper Structure (27 sections, 3 theorems, 32 equations, 6 figures, 3 tables, 3 algorithms)

This paper contains 27 sections, 3 theorems, 32 equations, 6 figures, 3 tables, 3 algorithms.

Key Result

Lemma 3.1

(Nissim et al.) $S^*_{f,\beta}(\mathcal{D}) = \max \{e^{-\beta k} \mathcal{T}_k(\mathcal{D}) \mid k \in \mathbb{N}\}$ (proof recalled in the appendix).
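The maximum in Lemma 3.1 can be evaluated iteratively. The sketch below is one possible way to do so, under two assumptions that are not stated on this page: that $\mathcal{T}_k(\mathcal{D})$ is available as a callable `t_k(k)`, and that it is bounded above by a known global sensitivity, so the exponential factor eventually makes further terms irrelevant. The function name `smooth_sensitivity` and the toy `t_k` are hypothetical and do not correspond to the paper's Gini-specific result.

```python
import math
from typing import Callable


def smooth_sensitivity(
    t_k: Callable[[int], float],
    beta: float,
    global_sens: float,
) -> float:
    """Return max over k >= 0 of exp(-beta * k) * t_k(k).

    Assumes 0 <= t_k(k) <= global_sens for every k (an assumption of this
    sketch), which justifies stopping the search early.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    best = 0.0
    k = 0
    # Every term with index >= k is at most exp(-beta * k) * global_sens,
    # so once that bound cannot exceed `best` the search can stop.
    while math.exp(-beta * k) * global_sens > best:
        best = max(best, math.exp(-beta * k) * t_k(k))
        k += 1
    return best


# Toy stand-in for T_k: grows linearly with k and is capped at 1.0.
def toy_t_k(k: int) -> float:
    return min(1.0, 0.1 * (k + 1))


s_star = smooth_sensitivity(toy_t_k, beta=0.5, global_sens=1.0)
```

With this toy input the result lies between `toy_t_k(0)` and the global bound of 1.0, showing how the smooth value can sit well below the global sensitivity.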

Figures (6)

  • Figure 1: Comparison of the amplitude (log scale) of the noise added by the Laplace mechanism scaled to either the Smooth or Global Sensitivities.
  • Figure 2: Comparison of Noisy counts and Noisy Gini versions using global sensitivity (log-scaled), applied on the Compas dataset.
  • Figure 3: Comparison of Noisy counts and Noisy Gini versions using global sensitivity (log-scaled) on the German credit and Adult datasets.
  • Figure 4: Comparison based on the test accuracy of different DP rule list algorithms.
  • Figure 5: Pipeline of Membership Inference Attack
  • ...and 1 more figures

Theorems & Definitions (4)

  • Lemma 3.1
  • Theorem 4.1: Smooth Sensitivity of the Gini impurity
  • proof
  • Theorem A.1: Post-processing theorem