Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation

Jensen Hwa, Qingyu Zhao, Aditya Lahiri, Adnan Masood, Babak Salimi, Ehsan Adeli

TL;DR

This paper tackles enforcing conditional independence to achieve fair and unconfounded representations and enable causal image generation. It recasts CI as an equality of Jensen-Shannon divergences and introduces a dynamic sampling scheme to apply CI in high-dimensional latent spaces, applicable to any encoder. The authors demonstrate that enforcing CI in the latent space (v-space) yields better fairness-accuracy trade-offs than label-space CI and enables race-controlled generation through partial CI in diffusion autoencoders. The approach provides a model-agnostic training paradigm with practical impact on fair predictive modeling and controllable image synthesis in high-dimensional settings.

Abstract

Conditional independence (CI) constraints are critical for defining and evaluating fairness in machine learning, as well as for learning unconfounded or causal representations. Traditional methods for ensuring fairness either blindly learn invariant features with respect to a protected variable (e.g., race when classifying sex from face images) or enforce CI relative to the protected attribute only on the model output (e.g., the sex label). Neither of these methods is effective in enforcing CI in high-dimensional feature spaces. In this paper, we focus on a nascent approach characterizing the CI constraint in terms of two Jensen-Shannon divergence terms, and we extend it to high-dimensional feature spaces using a novel dynamic sampling strategy. In doing so, we introduce a new training paradigm that can be applied to any encoder architecture. We are able to enforce conditional independence of the diffusion autoencoder latent representation with respect to any protected attribute under the equalized odds constraint and show that this approach enables causal image generation with controllable latent spaces. Our experimental results demonstrate that our approach can achieve high accuracy on downstream tasks while upholding equality of odds.
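
To make the equalized odds constraint mentioned above concrete: it requires the prediction to be conditionally independent of the protected attribute given the true label. The sketch below is not the paper's training objective or its dynamic sampling scheme. It is only a small diagnostic, assuming binary labels, binary predictions, and a binary protected attribute, that compares the prediction distribution across the two groups within each true-label stratum, using both a rate gap and a Jensen-Shannon divergence. The function names `js_divergence` and `equalized_odds_report` are hypothetical.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute zero to the KL sum
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def equalized_odds_report(y_true, y_pred, group):
    """Compare P(y_pred | Y=y, A=0) with P(y_pred | Y=y, A=1) for y in {0, 1}.

    Assumes all three arrays are binary (0/1) and that every (y, a)
    combination occurs at least once in the data.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    report = {}
    for y in (0, 1):
        rates = []
        for a in (0, 1):
            sel = (y_true == y) & (group == a)
            rates.append(float(y_pred[sel].mean()))  # P(y_pred = 1 | Y=y, A=a)
        p0 = [1.0 - rates[0], rates[0]]
        p1 = [1.0 - rates[1], rates[1]]
        report[y] = {
            "rate_gap": abs(rates[0] - rates[1]),
            "js": js_divergence(p0, p1),
        }
    return report


# Toy data: within Y=1, group 0 is always predicted positive and group 1 never is.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
group = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred_unfair = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred_fair = [1, 0, 1, 0, 0, 1, 0, 1]

unfair = equalized_odds_report(y_true, y_pred_unfair, group)
fair = equalized_odds_report(y_true, y_pred_fair, group)
```

Under these assumptions, both quantities are zero in every stratum exactly when equalized odds holds on the sample, which is the case for `y_pred_fair` here. For `y_pred_unfair`, the `Y=1` stratum shows the maximal gap. Extending this kind of check to a high-dimensional latent vector instead of a binary prediction is the harder problem that the abstract addresses.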

Paper Structure (23 sections, 12 equations, 9 figures, 2 tables)

Figures (9)

  • Figure 1: We propose a new way to ensure fairness in downstream tasks by enforcing conditional independence constraints on the latent representation. This is achieved by minimizing the Jensen-Shannon divergence (JS) between distributions obtained using a novel dynamic sampling technique. In the setting shown here, we apply our technique to the diffusion autoencoder's semantic representation to disentangle the sensitive attribute of skin type (a proxy variable for race) and perform causal image generation.
  • Figure 2: High-level view of our architecture. We introduce two variants of a conditional independence enforcer that can be added to any off-the-shelf encoder.
  • Figure 3: Synthetic data format and sample. The diagonal kernels are controlled by $\sigma_A$, while the off-diagonals are controlled by $\sigma_B$.
  • Figure 4: Synthetic data causal diagram. We apply the $L_{CI}$ component to either the latent vector space (V) or the label space (Y).
  • Figure 5: Fairness and accuracy metrics versus conditional independence strength $\lambda$. Orange lines correspond to the $y$-space CI-CNN, and blue lines to the $v$-space CI-CNN.
  • ...and 4 more figures