Robust Fair Clustering with Group Membership Uncertainty Sets

Sharmila Duppala; Juan Luque; John P. Dickerson; Seyed A. Esmaeili

TL;DR

This paper introduces a simple noise model that requires a small number of parameters to be given by the decision maker and presents an algorithm for fair clustering with provable robustness guarantees.

Abstract

We study the canonical fair clustering problem, where each cluster is constrained to have close to population-level representation of each group. Despite significant attention, the salient issue of having incomplete knowledge about the group membership of each point has been addressed only superficially. In this paper, we consider a setting where the assigned group memberships are noisy. We introduce a simple noise model that requires a small number of parameters to be given by the decision maker. We then present an algorithm for fair clustering with provable robustness guarantees. Our framework enables the decision maker to trade off between robustness and clustering quality. Unlike previous work, our algorithms are backed by worst-case theoretical guarantees. Finally, we empirically verify the performance of our algorithm on real-world datasets and show its superior performance over existing baselines.

Paper Structure (22 sections, 13 theorems, 47 equations, 12 figures, 2 algorithms)

Introduction
Outline and Contributions:
Additional Related Work
Preliminaries and Previous Noise Models
Preliminaries
Previous Noise Models in Fair Clustering
Our Noise Model and Problem Statement
Algorithm and Theoretical Analysis
Our Algorithm: RobustAlg
Failure When Using a Vanilla Clustering Algorithm
Experiments
Useful Fact
Omitted Proofs
MaxFlow Rounding
More Discussion About Previous Noise Models and Their Algorithms in Fair Clustering
...and 7 more sections

Key Result

Proposition 4.1

For any instance with noise parameters $\{ m_{h}^{+},m_{h}^{-} \}_{h \in \mathcal{H}}$, the group uncertainty set $\mathcal{U}$ is defined as the set of all group assignments $\hat{\chi}:\mathcal{P} \rightarrow \mathcal{H}$ that satisfy the constraints in eq:neg and eq:pos.
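As a rough illustration of what such an uncertainty set looks like, the sketch below enumerates it by brute force for a tiny instance. The constraints eq:neg and eq:pos are not reproduced on this page, so the code assumes one plausible reading: at most $m_h^-$ points assigned to group $h$ may actually belong to another group, and at most $m_h^+$ points assigned to other groups may actually belong to $h$. The function name `uncertainty_set`, the dictionaries `m_minus` and `m_plus`, and the four-point instance are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def uncertainty_set(assigned, groups, m_minus, m_plus):
    """Enumerate all group assignments allowed under the ASSUMED constraints.

    assigned: list of group labels, one per point (the noisy given labels).
    m_minus[h]: assumed cap on points that leave group h.
    m_plus[h]: assumed cap on points that join group h.
    """
    feasible = []
    for candidate in product(groups, repeat=len(assigned)):
        left = Counter()    # points whose label moved away from each group
        joined = Counter()  # points whose label moved into each group
        for old, new in zip(assigned, candidate):
            if old != new:
                left[old] += 1
                joined[new] += 1
        if all(left[h] <= m_minus[h] and joined[h] <= m_plus[h] for h in groups):
            feasible.append(candidate)
    return feasible


# Hypothetical toy instance: two points labelled red, two labelled blue.
groups = ["red", "blue"]
assigned = ["red", "red", "blue", "blue"]
loose_plus = {"red": 4, "blue": 4}  # m_plus set large enough to be non-binding

small = uncertainty_set(assigned, groups, {"red": 2, "blue": 1}, loose_plus)
large = uncertainty_set(assigned, groups, {"red": 2, "blue": 2}, loose_plus)
print(len(small), len(large))
```

Brute-force enumeration is exponential in the number of points, so this is only useful for building intuition on toy examples. Setting every parameter to zero collapses the set to the single given assignment, which corresponds to the noise-free case.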

Figures (12)

Figure 1: All colorings in the uncertainty set of a toy example with $4$ points and $m_{\text{red}}^-=2, m_{\text{blue}}^-=1$ are shown.
Figure 2: All colorings in the uncertainty set for the same toy example, now with $m = 2$. Note the new color assignments in the bottom row where the two (originally) blue points become red.
Figure 3: Plots for $k$-center objective and fairness violation as $m/n$ increases. Over half of RobustAlg's pictured fairness violations are exactly zero; the rest fall between $10^{-5}$ and $10^{-3}$. The 95% confidence interval around ProbAlg is shaded; however, it is faint because the fairness violations are very sharply concentrated around their plotted mean.
Figure 4: Comparison of the effect of increasing noise, given different noise model configurations.
Figure 5: Toy example showing that, for the set of centers $S_{\text{vll}}$ returned by the vanilla $k$-center algorithm, there is no feasible solution to LP \ref{lp:rfa}-\ref{eq:endrfa}, even for arbitrarily large $R$.
...and 7 more figures

Theorems & Definitions (25)

Proposition 4.1
Lemma 5.1
Lemma 5.2
Lemma 5.3
Theorem 5.1
Theorem 5.2
proof
Proposition B.1
proof
proof
...and 15 more
