On Information Theoretic Fairness: Compressed Representations With Perfect Demographic Parity

Amirreza Zamani, Borja Rodríguez-Gálvez, Mikael Skoglund

TL;DR

This work studies how to produce fair or private representations $Y$ of data $X$ that retain as much information as possible about a task $T$ while being independent of a sensitive attribute $S$, that is, $I(Y;S)=0$. It considers two rate-constrained problems: (i) maximize $I(Y;T)$ subject to $I(Y;X)\le r$, and (ii) maximize $I(Y;T|S)$ subject to $I(Y;X|T,S)\le r$, the latter inspired by the Conditional Fairness Bottleneck. Both can be read as perfect demographic parity or perfect privacy problems. Building on extended versions of the Functional Representation Lemma (FRL) and the Strong FRL (SFRL), the paper derives constructive mechanisms and bounds on the achievable utility (e.g., the upper bound $H(T|S)$ and the bounds $L_1^{r}, L_2, L_3^{r}$) that quantify how utility degrades under compression, and it identifies special cases where the bounds are tight (e.g., when $S$ is a deterministic function of $T$ or $X$ is a function of $S$). The results treat fairness and privacy in a single information-theoretic framework with rate constraints and yield constructive designs for fair or private representations.
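
The $H(T|S)$ ceiling follows from the independence constraint $I(Y;S)=0$ alone. The short argument below is a sketch that uses only the chain rule for mutual information; it is not necessarily the route taken in the paper's proofs.

```latex
\begin{align*}
I(Y;T) &\le I(Y;T,S) \\
       &= I(Y;S) + I(Y;T \mid S) \\
       &= I(Y;T \mid S) && \text{since } I(Y;S) = 0 \\
       &\le H(T \mid S).
\end{align*}
```

The last inequality alone shows that the objective of problem (ii), $I(Y;T|S)$, is bounded by the same quantity.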

Abstract

In this article, we study the fundamental limits in the design of fair and/or private representations achieving perfect demographic parity and/or perfect privacy through the lens of information theory. More precisely, given some useful data $X$ that we wish to employ to solve a task $T$, we consider the design of a representation $Y$ that contains no information about some sensitive attribute or secret $S$, that is, such that $I(Y;S) = 0$. We consider two scenarios. First, we consider a design desideratum where we want to maximize the information $I(Y;T)$ that the representation contains about the task, while constraining the level of compression (or encoding rate), that is, ensuring that $I(Y;X) \leq r$. Second, inspired by the Conditional Fairness Bottleneck problem, we consider a design desideratum where we want to maximize the information $I(Y;T|S)$ that the representation contains about the task and that is not shared by the sensitive attribute or secret, while constraining the amount of irrelevant information, that is, ensuring that $I(Y;X|T,S) \leq r$. In both cases, we employ extended versions of the Functional Representation Lemma and the Strong Functional Representation Lemma and study the tightness of the obtained bounds. Every result here can also be interpreted as a coding with perfect privacy problem by considering the sensitive attribute as a secret.
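
To make the quantities in the first scenario concrete, the following Python sketch evaluates $I(Y;S)$, $I(Y;X)$, $I(Y;T)$ and $H(T|S)$ on a toy joint distribution. The distribution, the representation $Y$, and the helper names `entropy` and `mutual_information` are assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in bits of a pmf given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log2(p)).sum())


def mutual_information(p_ab):
    """I(A;B) in bits from a 2-D joint pmf p_ab[a, b]."""
    p_ab = np.asarray(p_ab, dtype=float)
    return entropy(p_ab.sum(axis=1)) + entropy(p_ab.sum(axis=0)) - entropy(p_ab)


# Toy model (illustrative assumption): S and N are independent fair bits,
# the data is X = 2*S + N, the task is T = X, and the representation keeps
# only the non-sensitive bit, Y = N.
p = np.zeros((2, 4, 4, 2))  # joint pmf indexed as p[s, x, t, y]
for s in (0, 1):
    for n in (0, 1):
        x = 2 * s + n
        p[s, x, x, n] = 0.25

i_ys = mutual_information(p.sum(axis=(1, 2)))  # marginal over (s, y)
i_yx = mutual_information(p.sum(axis=(0, 2)))  # marginal over (x, y)
i_yt = mutual_information(p.sum(axis=(0, 1)))  # marginal over (t, y)

# H(T|S) = H(S,T) - H(S)
h_t_given_s = entropy(p.sum(axis=(1, 3))) - entropy(p.sum(axis=(1, 2, 3)))

print(f"I(Y;S) = {i_ys:.3f} bits")
print(f"I(Y;X) = {i_yx:.3f} bits")
print(f"I(Y;T) = {i_yt:.3f} bits")
print(f"H(T|S) = {h_t_given_s:.3f} bits")
```

In this toy case the representation is independent of $S$, uses a rate of one bit, and attains $I(Y;T) = H(T|S) = 1$ bit. General distributions do not admit such a simple choice of $Y$.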

Paper Structure (12 sections, 4 theorems, 38 equations, 2 figures)

This paper contains 12 sections, 4 theorems, 38 equations, 2 figures.

Key Result

Theorem 1

For every compression level $0\leq r\leq H(X|S)$ and random variables $(S, X, T) \sim P_{S,X,T}$, we have that [...] where [...] and $\alpha=r/H(X|S)$. Moreover, for $H(X|S)\leq r< H(X)$ we have that [...] where ${L'}_1^{r} = H(T|S)-H(S|T)=H(T)-H(S)$.
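
The equality $H(T|S)-H(S|T)=H(T)-H(S)$ in the definition of ${L'}_1^{r}$ follows from expanding both conditional entropies through the joint entropy $H(S,T)$:

```latex
\begin{align*}
H(T \mid S) - H(S \mid T)
  &= \bigl(H(S,T) - H(S)\bigr) - \bigl(H(S,T) - H(T)\bigr) \\
  &= H(T) - H(S).
\end{align*}
```

The quantity is therefore positive only when $H(T) > H(S)$.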

Figures (2)

  • Figure 1: Data representation with perfect demographic parity or privacy. We want to design a representation $Y$ of the data $X$ that is useful for the task $T$, is compressed, and independent of the sensitive attribute or secret $S$.
  • Figure 2: Information diagram [yeung1991new] of the Conditional Fairness Bottleneck (CFB) with perfect demographic parity. In light gray, we show the relevant information about the task $T$, not shared by the sensitive attribute $S$, that we want the representation $Y$ to maximize. In dark gray, we show the irrelevant information about the data $X$ that we want to constrain. Contrary to the standard CFB [vari], we enforce that the representation contains no information about the sensitive attribute (perfect demographic parity or perfect privacy).

Theorems & Definitions (13)

  • Remark 1
  • Remark 2
  • Theorem 1
  • Corollary 1
  • Remark 3
  • Remark 4
  • Remark 5
  • Remark 6
  • Example 1
  • Theorem 2
  • ...and 3 more