Euclidean k-center Fair Clusterings

Ayano Moritaka, Shin-ichi Nakano, Kento Tanaka, Noriaki Yoshida

TL;DR

The paper addresses fair clustering on the plane by imposing per-color lower and upper bounds within each cluster, and seeks an optimal partition of $P$ into $k$ clusters with minimum radius. It solves this by reducing to a sequence of Euclidean $k$-center $(\ell,u)$-clustering subproblems and applying a maximum-flow-with-lower-bounds framework, aided by geometric observations that bound candidate disks. The main result is a polynomial-time algorithm with running time $O(|C|n^{2k+4})$ for fixed $k$, enabling exact computation of fair clusterings even when points have multiple colors. This contributes a rigorous, exact approach to fair clustering in planar data, with potential for heuristics, higher-dimensional generalization, and alternative fairness formulations as future work.

Abstract

Many approximation and heuristic algorithms for finding a fair clustering have emerged. In this paper we define a new and natural variant of the fair clustering problem and design a polynomial-time algorithm to compute an optimal fair clustering. Let P be a set of n points on a plane, where each point has a color in C corresponding to a group. For each color q in C, a lower bound l(q) and an upper bound u(q) are given. The fair k-clustering problem is to find a minimum-cost partition of P into k clusters such that each cluster contains at least l(q) and at most u(q) points of P with color q. The bounds l(q) and u(q) ensure that no cluster contains too few or too many points of a specific color. If we regard a color as representing a gender or a minority ethnic group, such a clustering corresponds to a fair clustering.
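The fairness constraint in the abstract can be illustrated with a small checker. This is only a sketch of the constraint, not the paper's algorithm: the function name `is_fair_clustering` and the data layout (a list of colors, a list of cluster indices, and dictionaries for l(q) and u(q)) are assumptions made for illustration.

```python
from collections import Counter


def is_fair_clustering(colors, assignment, k, lower, upper):
    """Check the per-color bounds for every cluster.

    colors[i]     -- color of point i (hypothetical layout)
    assignment[i] -- index in range(k) of the cluster holding point i
    lower, upper  -- dicts mapping each color q to l(q) and u(q)
    """
    # Count how many points of each color land in each cluster.
    counts = [Counter() for _ in range(k)]
    for point, cluster in enumerate(assignment):
        counts[cluster][colors[point]] += 1

    # A Counter returns 0 for a missing color, so empty colors are handled.
    return all(
        lower[q] <= counts[c][q] <= upper[q]
        for c in range(k)
        for q in lower
    )
```

With two clusters, bounds of 1 to 2 red points and exactly 1 blue point per cluster, an assignment that places three red points in one cluster is rejected, while a balanced split is accepted.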

Paper Structure

This paper contains 4 sections, 2 theorems, 5 figures, 1 algorithm.

Key Result

Theorem 1

One can solve the $k$-center $(\ell,u)$-clustering problem on a plane in $O(n^{2k+4})$ time.

Figures (5)

  • Figure 1: (a) A non-fair clustering and (b) a fair clustering.
  • Figure 2: A fair clustering with $\ell(red)=3$, $u(red)=4$, $\ell(blue)=1$, $u(blue)=2$, $\ell(green)=2$ and $u(green)=3$.
  • Figure 3: (a) A disk $D$ with three points of $P$ on the boundary, and (b) a disk with two points of $P$ on the boundary whose distance equals the diameter.
  • Figure 4: Each pair of points in $P$ may define two disks that have both points on the boundary and the same radius as $D$.
  • Figure 5: A network derived from a Euclidean $k$-center $(\ell,u)$-clustering problem.
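The geometric observations behind Figures 3 and 4 can be sketched in code. The following is an illustration based only on the captions above, not the authors' implementation: `candidate_radii` and `centers_through` are hypothetical helper names, and the tolerance `1e-12` for detecting collinear triples is an assumption.

```python
import math
from itertools import combinations


def candidate_radii(points):
    """Radii of disks determined by points as in Figure 3.

    Case (b): two points form a diameter, so the radius is half their distance.
    Case (a): three points lie on the boundary, so the radius is the
    circumradius of the triangle they form.
    """
    radii = set()
    for a, b in combinations(points, 2):
        radii.add(math.dist(a, b) / 2.0)
    for a, b, c in combinations(points, 3):
        ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
        # cross is twice the signed area of triangle abc.
        cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
        if abs(cross) > 1e-12:  # skip (nearly) collinear triples
            # Circumradius R = ab*bc*ca / (4 * area) = ab*bc*ca / (2 * |cross|).
            radii.add(ab * bc * ca / (2.0 * abs(cross)))
    return sorted(radii)


def centers_through(p, q, r):
    """Centers of the disks of radius r with p and q on the boundary (Figure 4)."""
    d = math.dist(p, q)
    if d == 0 or d > 2 * r:
        return []  # no such disk, or p and q coincide
    mx, my = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    # Distance from the midpoint of pq to each center.
    h = math.sqrt(max(r * r - (d / 2.0) ** 2, 0.0))
    # Unit vector perpendicular to pq.
    ux, uy = -(q[1] - p[1]) / d, (q[0] - p[0]) / d
    if h == 0:
        return [(mx, my)]  # p and q are diametrically opposite
    return [(mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)]
```

Enumerating radii this way yields $O(n^3)$ candidates, and each pair of points contributes at most two disks per radius, which is consistent with a running time that is polynomial in $n$ for fixed $k$.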

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2