Privacy for Fairness: Information Obfuscation for Fair Representation Learning with Local Differential Privacy

Songjie Xie, Youlong Wu, Jiaxuan Li, Ming Ding, Khaled B. Letaief

TL;DR

This work tackles the interdependent goals of privacy and fairness in machine learning by proposing an information bottleneck–based information obfuscation framework that leverages local differential privacy (LDP) during representation learning. It introduces a non-adversarial variational encoding approach that can be trained without a variational prior and demonstrates that LDP randomizers constrain sensitive information leakage while preserving utility within the IB objective. The authors prove theoretical relations showing that, under an ε-LDP budget, one can achieve a controllable utility–leakage tradeoff (Γ,Ω) with Γ bounded by γ and Ω bounded by ε minus the residual mutual information I(X;Z|S). Empirical results on colored-MNIST and real-world datasets (Adult, Compas, HSLS) validate that the proposed method attains both LDP and fair representations with competitive utility, underscoring the practical viability of privacy-aware fairness preprocessing.

Abstract

As machine learning (ML) becomes more prevalent in human-centric applications, there is a growing emphasis on algorithmic fairness and privacy protection. While previous research has explored these areas as separate objectives, there is a growing recognition of the complex relationship between privacy and fairness. However, previous works have primarily focused on examining the interplay between privacy and fairness through empirical investigations, with limited attention given to theoretical exploration. This study aims to bridge this gap by introducing a theoretical framework that enables a comprehensive examination of their interrelation. We shall develop and analyze an information bottleneck (IB) based information obfuscation method with local differential privacy (LDP) for fair representation learning. In contrast to many empirical studies on fairness in ML, we show that the incorporation of LDP randomizers during the encoding process can enhance the fairness of the learned representation. Our analysis will demonstrate that the disclosure of sensitive information is constrained by the privacy budget of the LDP randomizer, thereby enabling the optimization process within the IB framework to effectively suppress sensitive information while preserving the desired utility through obfuscation. Based on the proposed method, we further develop a variational representation encoding approach that simultaneously achieves fairness and LDP. Our variational encoding approach offers practical advantages. It is trained using a non-adversarial method and does not require the introduction of any variational prior. Extensive experiments will be presented to validate our theoretical results and demonstrate the ability of our proposed approach to achieve both LDP and fairness while preserving adequate utility.

Paper Structure

This paper contains 44 sections, 5 theorems, 30 equations, 6 figures, and 5 tables.

Key Result

Lemma 1

For any $P_{\hat{Z}|X}$, an $\epsilon$-LDP mechanism $\mathcal{M}: \mathcal{\hat{Z}} \to \mathcal{Z}$ induces a mapping $P_{Z|X}$ that satisfies $\epsilon$-LDP.
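
As an illustration of Lemma 1, the sketch below applies a Laplace randomizer to the output $\hat{z}$ of an arbitrary encoder. Because the noise is calibrated only to the range of $\hat{z}$, the guarantee holds whatever encoder produced it. The function name `laplace_randomizer`, the clipping bound `clip`, and the choice to calibrate noise to the $L_1$ diameter of the clipped cube are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def laplace_randomizer(z_hat, epsilon, clip=1.0, rng=None):
    """Hypothetical epsilon-LDP Laplace randomizer M: Z_hat -> Z.

    Assumption: each coordinate of z_hat is clipped to [-clip, clip], so any
    two inputs differ by at most 2 * clip * d in L1 norm (d = last dimension).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    # Clip so the input domain is bounded; this bound defines the sensitivity.
    z_hat = np.clip(np.atleast_1d(np.asarray(z_hat, dtype=float)), -clip, clip)
    d = z_hat.shape[-1]
    sensitivity = 2.0 * clip * d      # L1 diameter of the cube [-clip, clip]^d
    scale = sensitivity / epsilon     # Laplace scale b = sensitivity / epsilon
    # i.i.d. Laplace noise per coordinate; each row is randomized independently.
    return z_hat + rng.laplace(loc=0.0, scale=scale, size=z_hat.shape)

# Usage: randomize a batch of two 3-dimensional representations.
z = laplace_randomizer([[0.2, -0.7, 1.5], [0.0, 0.4, -0.1]], epsilon=1.0)
```

With this calibration, the density ratio of the output for any two clipped inputs is at most $e^{\epsilon}$, so a smaller $\epsilon$ gives a larger noise scale and a noisier representation.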

Figures (6)

  • Figure 1: The considered scenario of fair representation learning for information obfuscation with local differential privacy.
  • Figure 2: Relation between $\epsilon$, $\Omega$, and $\nu^*$.
  • Figure 3: The proposed variational encoding framework.
  • Figure 4: The encoded representations with $\epsilon$-LDP Laplace mechanism for random samples in Colored-MNIST dataset. As $\epsilon$ decreases, the representations on the left become increasingly noisy, leading to more obfuscation of the color attribute (e.g., samples enclosed by a red box). Meanwhile, the representations on the right also exhibit increased noise, yet the color correlation persists (e.g., samples enclosed by a blue box).
  • Figure 5: Accuracy-$\Delta_{\text{DP}}$ tradeoffs with varying values of $\epsilon \in [0, 10^3]$ and $\beta \in [0.1, 10^3]$ on (a) adult, (b) compas, and (c) hsls datasets.
  • ...and 1 more figure
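
The $\Delta_{\text{DP}}$ axis in Figure 5 refers to a demographic parity gap. The paper's exact definition is not reproduced in this summary, so the sketch below uses the common form for binary predictions and a binary sensitive attribute: the absolute difference in positive-prediction rates between the two groups. The function name `demographic_parity_gap` and the 0/1 encoding are assumptions made for this example.

```python
import numpy as np

def demographic_parity_gap(y_pred, s):
    """Assumed form: |P(y_pred = 1 | s = 0) - P(y_pred = 1 | s = 1)|.

    y_pred: binary predictions (0/1); s: binary sensitive attribute (0/1).
    """
    y_pred = np.asarray(y_pred, dtype=float)
    s = np.asarray(s)
    group0, group1 = y_pred[s == 0], y_pred[s == 1]
    if group0.size == 0 or group1.size == 0:
        raise ValueError("both sensitive groups must be non-empty")
    # Mean of 0/1 predictions is the positive-prediction rate of the group.
    return float(abs(group0.mean() - group1.mean()))

# Usage: predictions fully determined by the sensitive attribute.
gap = demographic_parity_gap([1, 1, 0, 0], [0, 0, 1, 1])
```

A gap of 0 means both groups receive positive predictions at the same rate; larger values indicate a stronger dependence of the prediction on the sensitive attribute.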

Theorems & Definitions (19)

  • Definition 1: Local differential privacy
  • Definition 2: Utility-leakage pair
  • Definition 3: Optimality
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Theorem 1
  • proof
  • Corollary 1.1
  • ...and 9 more