On the Alignment of Group Fairness with Attribute Privacy

Jan Aalmoes, Vasisht Duddu, Antoine Boutet

TL;DR

The paper examines how group fairness relates to attribute privacy in machine learning, showing that fairness constraints induce output indistinguishability, which protects against attribute inference attacks (AIAs). It introduces AdaptAIA, an attack tailored to real-world data with imbalanced sensitive attributes, and demonstrates that two standard fairness methods, Exponentiated Gradient Descent (EGD) with Demographic Parity (DemPar) and Adversarial Debiasing (AdvDebias), suppress AdaptAIA’s success. The theoretical analysis shows that EGD + DemPar bounds attack accuracy at random guessing when the DemPar-Level is zero, and empirical results on COMPAS, CENSUS, MEPS, and LFW confirm a substantial defense against attacks on both soft and hard outputs, albeit at a cost in utility. The findings suggest that enforcing group fairness can serve as a practical defense against attribute inference in blackbox settings at no cost beyond the existing fairness-utility trade-off, highlighting output indistinguishability as a general principle shared by privacy and fairness.

Abstract

Group fairness and privacy are fundamental aspects in designing trustworthy machine learning models. Previous research has highlighted conflicts between group fairness and different privacy notions. We are the first to demonstrate the alignment of group fairness with the specific privacy notion of attribute privacy in a blackbox setting. Attribute privacy, quantified by the resistance to attribute inference attacks (AIAs), requires indistinguishability in the target model's output predictions. Group fairness guarantees this, thereby mitigating AIAs and achieving attribute privacy. To demonstrate this, we first introduce AdaptAIA, an enhancement of existing AIAs, tailored for real-world datasets with class imbalances in sensitive attributes. Through theoretical and extensive empirical analyses, we demonstrate the efficacy of two standard group fairness algorithms (i.e., adversarial debiasing and exponentiated gradient descent) against AdaptAIA. Additionally, since using group fairness results in attribute privacy, it acts as a defense against AIAs, for which defenses are currently lacking. Overall, we show that group fairness aligns with attribute privacy at no additional cost other than the already existing trade-off with model utility.

Paper Structure (18 sections, 6 theorems, 6 equations, 18 figures, 3 tables)

Key Result

Theorem 6.1

The maximum attack accuracy achievable by AdaptAIA-H is equal to $\frac{1}{2}\left(1+\text{DemPar-Level of } f_{trg}\right)$.
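
The bound in Theorem 6.1 can be checked numerically on synthetic data. The sketch below is illustrative only and rests on assumptions not stated on this page: a binary sensitive attribute, binary hard predictions, DemPar-Level taken as the absolute gap in positive-prediction rates between the two groups, and attack accuracy measured as balanced accuracy. All function names are hypothetical and are not taken from the paper's code.

```python
import itertools
import numpy as np


def dempar_level(y_hat, s):
    """Assumed definition: |P(y_hat=1 | s=1) - P(y_hat=1 | s=0)|."""
    y_hat, s = np.asarray(y_hat), np.asarray(s)
    return abs(y_hat[s == 1].mean() - y_hat[s == 0].mean())


def balanced_accuracy(s_true, s_pred):
    """Mean of the true-positive and true-negative rates."""
    s_true, s_pred = np.asarray(s_true), np.asarray(s_pred)
    tpr = (s_pred[s_true == 1] == 1).mean()
    tnr = (s_pred[s_true == 0] == 0).mean()
    return 0.5 * (tpr + tnr)


def best_hard_label_attack(y_hat, s):
    """Try all four maps {0,1} -> {0,1} from hard prediction to guessed attribute."""
    y_hat = np.asarray(y_hat)
    best = 0.0
    for g0, g1 in itertools.product([0, 1], repeat=2):
        s_guess = np.where(y_hat == 1, g1, g0)  # guess g1 if y_hat=1, else g0
        best = max(best, balanced_accuracy(s, s_guess))
    return best


# Synthetic, imbalanced sensitive attribute (about 20% of samples have s=1).
rng = np.random.default_rng(0)
s = (rng.random(10_000) < 0.2).astype(int)
# Hypothetical target model whose positive rate depends on the group.
y_hat = (rng.random(10_000) < np.where(s == 1, 0.7, 0.4)).astype(int)

level = dempar_level(y_hat, s)
bound = 0.5 * (1 + level)
best = best_hard_label_attack(y_hat, s)
print(f"DemPar-Level={level:.3f}  bound={bound:.3f}  best attack={best:.3f}")
```

Under these assumptions the best hard-label attack matches the bound exactly on the sample, and driving the DemPar-Level to zero pushes the bound to 0.5, i.e. random guessing.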

Figures (18)

  • Figure 1: Privacy is violated if a (potentially sensitive) attribute can be inferred from a learning model's output predictions because those predictions are distinguishable across attribute values (see the appendix on distinguishability).
  • Figure 2: $\mathcal{A}dv$ wants to infer $S(\omega)$ for $X(\omega)$ given $f_{trg}(X(\omega))$. $\mathcal{A}dv$ uses $f_{att}$, which is trained on $\mathcal{D}_{aux}$, to map $f_{trg}(X'(\omega))$ to $S'(\omega)$.
  • Figure 3: ROC curve: $\upsilon^*$ can lower FPR to infer Race.
  • Figure 4: For AdaptAIA-H, we observe that EGD reduces the attack accuracy to random guess ($\sim$50%).
  • Figure 5: Imposing fairness with EGD + DemPar has a significant impact on the accuracy of $f_{trg}$, matching the observation from prior work [reductions].
  • ...and 13 more figures

Theorems & Definitions (12)

  • Definition 2.1
  • Definition 2.2
  • Theorem 6.1
  • Theorem 7.1
  • Theorem (unresolved reference: th:dpgood)
  • Proof
  • Definition D.1
  • Theorem (unresolved reference: th:advdebias)
  • Proof
  • Theorem E.1
  • ...and 2 more