From Discrete to Continuous: Deep Fair Clustering With Transferable Representations

Xiang Zhang

TL;DR

This work tackles deep fair clustering for both discrete and continuous sensitive attributes. An autoencoder learns latent representations $Z$ under an information-bottleneck objective that balances clustering utility, data fidelity, and fairness: it maximizes $I(Z;C)$ and $I(Z;X)$ while minimizing $I(Z;G)$, where $I(Z;G)$ is upper-bounded by a variational CLUB-based estimator. The authors unify existing fairness criteria under a mutual information framework and report competitive performance on discrete attributes, improved fairness on continuous attributes, and representations that transfer to downstream tasks such as few-shot classification across several datasets. Because fairness is imposed on the representations rather than on the cluster assignments, it is intended to carry over to downstream tasks regardless of the clustering results.
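To make the CLUB-based bound on $I(Z;G)$ more concrete, the following is a minimal sketch of the sample-based CLUB estimate, $\frac{1}{N}\sum_i \big[\log q(g_i \mid z_i) - \frac{1}{N}\sum_j \log q(g_j \mid z_i)\big]$, for a scalar representation and a continuous scalar sensitive attribute. Everything specific here is an assumption made for illustration: the function names are hypothetical, and the variational distribution $q(g \mid z)$ is a least-squares linear-Gaussian fit standing in for the learned network a deep model would use. It is not the authors' implementation.

```python
import math
import random


def fit_linear_gaussian(z, g):
    """Fit q(g|z) = N(g; a*z + b, var) by least squares.

    Assumption: a stand-in for a learned variational network.
    """
    n = len(z)
    mean_z, mean_g = sum(z) / n, sum(g) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szg = sum((zi - mean_z) * (gi - mean_g) for zi, gi in zip(z, g))
    a = szg / szz if szz > 0 else 0.0
    b = mean_g - a * mean_z
    var = sum((gi - (a * zi + b)) ** 2 for zi, gi in zip(z, g)) / n
    return a, b, max(var, 1e-6)  # clamp to avoid division by zero


def log_q(g, z, params):
    """Gaussian log-density log q(g|z) under the fitted parameters."""
    a, b, var = params
    return -0.5 * math.log(2 * math.pi * var) - (g - (a * z + b)) ** 2 / (2 * var)


def club_estimate(z, g, params):
    """Sample-based CLUB estimate of I(Z;G) in nats."""
    n = len(z)
    # Mean log-likelihood over paired samples (z_i, g_i).
    positive = sum(log_q(gi, zi, params) for zi, gi in zip(z, g)) / n
    # Mean log-likelihood over all pairs (z_i, g_j), i.e. shuffled pairings.
    negative = sum(log_q(gj, zi, params) for zi in z for gj in g) / (n * n)
    return positive - negative


if __name__ == "__main__":
    random.seed(0)
    z = [random.gauss(0, 1) for _ in range(200)]
    g_dependent = [zi + 0.5 * random.gauss(0, 1) for zi in z]
    g_independent = [random.gauss(0, 1) for _ in range(200)]
    for name, g in [("dependent", g_dependent), ("independent", g_independent)]:
        print(name, club_estimate(z, g, fit_linear_gaussian(z, g)))
```

In a training loop, an estimate of this kind would serve as the fairness penalty to minimize: it is large when $G$ is predictable from $Z$ and close to zero when the two are (approximately) independent. The all-pairs term costs $O(N^2)$ per batch, which is why it is typically computed on mini-batches.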

Abstract

We consider the problem of deep fair clustering, which partitions data into clusters via the representations extracted by deep neural networks while hiding sensitive data attributes. To achieve fairness, existing methods present a variety of fairness-related objective functions based on the group fairness criterion. However, these works typically assume that the sensitive attributes are discrete and do not work for continuous sensitive variables, such as the proportion of the female population in an area. Besides, the potential of the representations learned from clustering tasks to improve performance on other tasks is ignored by existing works. In light of these limitations, we propose a flexible deep fair clustering method that can handle discrete and continuous sensitive attributes simultaneously. Specifically, we design an information bottleneck style objective function to learn fair and clustering-friendly representations. Furthermore, we explore for the first time the transferability of the extracted representations to other downstream tasks. Unlike existing works, we impose fairness at the representation level, which could guarantee fairness for the transferred task regardless of clustering results. To verify the effectiveness of the proposed method, we perform extensive experiments on datasets with discrete and continuous sensitive attributes, demonstrating the advantage of our method in comparison with state-of-the-art methods.
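As a reading aid, the information bottleneck style objective described above can be written in one plausible scalarized form. This is a sketch under stated assumptions rather than the paper's exact equation: the trade-off weights $\alpha$ and $\beta$ appear in the paper's sensitivity figure, but which term each one scales is assumed here, and the encoder notation $f_{\theta}$ is introduced only for illustration. $X$ denotes the data, $Z$ the representation, $C$ the cluster assignment, and $G$ the sensitive attribute.

```latex
% Assumed scalarization; the placement of \alpha and \beta is illustrative only.
\max_{\theta} \; I(Z;C) + \alpha \, I(Z;X) - \beta \, I(Z;G),
\qquad Z = f_{\theta}(X)
```

The first term rewards clustering-friendly representations, the second keeps $Z$ informative about the input, and the third penalizes any information about $G$ that remains in $Z$, whether $G$ is discrete or continuous.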

Paper Structure (25 sections, 4 theorems, 22 equations, 7 figures, 4 tables, 1 algorithm)

Introduction
Related Work
Fair Clustering
Fair Representation Learning
Problem Formulation and Motivations
Problem Formulation
Unifying Existing Methods Via Mutual Information
Limitations and Motivations
The Proposed Deep Fair Clustering Method
Fair Clustering Reformulation
Objective Functions
Framework overview
$I(Z;C)$
$I(Z;X)$
$I(Z;G)$
...and 10 more sections

Key Result

Proposition 1

The fairness criteria listed in the paper's table of fairness metrics are equivalent to finding a clustering that satisfies $I(C;G) = 0$.
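For a discrete sensitive attribute, the condition $I(C;G) = 0$ can be checked empirically with a plug-in estimate computed from paired cluster and group labels. The sketch below is illustrative only: the function name and the toy labels are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def mutual_information(clusters, groups):
    """Plug-in estimate of I(C;G) in nats from paired discrete labels."""
    n = len(clusters)
    joint = Counter(zip(clusters, groups))
    cluster_counts = Counter(clusters)
    group_counts = Counter(groups)
    mi = 0.0
    for (c, g), count in joint.items():
        # p(c,g) * log( p(c,g) / (p(c) * p(g)) ), expressed with raw counts.
        mi += (count / n) * math.log(count * n / (cluster_counts[c] * group_counts[g]))
    return mi


# Each cluster mirrors the overall group proportions: no dependence.
balanced = mutual_information([0, 0, 1, 1], ["a", "b", "a", "b"])
# Each cluster contains a single group: maximal dependence for two groups.
segregated = mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"])
```

Here `balanced` evaluates to zero because every cluster reproduces the overall group proportions, while `segregated` equals $\log 2$ nats because the cluster index fully determines the group.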

Figures (7)

Figure 1: We group the Census dataset into two clusters without considering fairness. The figure depicts the distributions of the proportion of the female population in the two clusters.
Figure 2: The illustration of the proposed method.
Figure 3: Visualization of the learned representations on the MNIST-USPS dataset. (a) The left three panels are colored by class, and (b) the right three panels are colored by group.
Figure 4: The effect of $\alpha$ and $\beta$ on $\mathrm{ACC}$ and $\mathrm{MNCE}$ on the MNIST-USPS dataset.
Figure 5: Ablation study on the Census dataset
...and 2 more figures

Theorems & Definitions (10)

Proposition 1
proof
Definition 1
Definition 2
Proposition 2
proof
Remark 1
Remark 2
Proposition 3
Lemma 1
