A Deep Latent Factor Graph Clustering with Fairness-Utility Trade-off Perspective

Siamak Ghodsi, Amjad Seyedi, Tai Le Quy, Fariba Karimi, Eirini Ntoutsi

TL;DR

This work tackles fair graph clustering by integrating a soft demographic-balance constraint directly into an end-to-end deep nonnegative tri-factorization (DFNMF) framework. The model builds a deep hierarchical representation $\mathbf{\Psi}=\mathbf{H}_1\cdots\mathbf{H}_p$ coupled with a final interaction matrix $\mathbf{W}_p$, and optimizes a unified objective that adds a fairness penalty $\lambda \|\mathbf{F}^\top\mathbf{\Psi}\|_F^2$ to the graph reconstruction term. This yields explicit control over the utility–fairness trade-off with a single parameter $\lambda$, enabling soft, interpretable cluster memberships without post-processing. DFNMF scales near-linearly with the number of edges using CSR-based sparse operations and alternating updates, and experiments on synthetic and real networks show it often dominates state-of-the-art baselines on Pareto fronts for modularity and demographic balance, while remaining interpretable through its hierarchical factors. The approach enables robust, scalable, fairness-aware graph clustering with practical implications for diverse domains where balanced representation is essential.
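To make the unified objective concrete, here is a minimal NumPy sketch. It rests on assumptions not stated in this summary: the reconstruction term is taken to be the symmetric tri-factorization loss $\|\mathbf{A}-\mathbf{\Psi}\mathbf{W}_p\mathbf{\Psi}^\top\|_F^2$, and $\mathbf{F}$ is taken to be a centered group-indicator matrix. The function names `group_encoding` and `dfnmf_objective` are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def group_encoding(groups):
    """Centered group-indicator matrix F of shape (n, g-1).

    Assumed encoding: column s is the indicator of group s minus that
    group's overall share, so every column sums to zero.
    """
    groups = np.asarray(groups)
    labels = np.unique(groups)
    cols = [(groups == s).astype(float) - np.mean(groups == s)
            for s in labels[:-1]]
    return np.stack(cols, axis=1)

def dfnmf_objective(A, H_list, W, F, lam):
    """Assumed loss: ||A - Psi W Psi^T||_F^2 + lam * ||F^T Psi||_F^2."""
    Psi = H_list[0]
    for H in H_list[1:]:
        Psi = Psi @ H                      # Psi = H_1 H_2 ... H_p
    recon = np.linalg.norm(A - Psi @ W @ Psi.T, "fro") ** 2
    fair = np.linalg.norm(F.T @ Psi, "fro") ** 2
    return recon + lam * fair, recon, fair

# Toy data: 12 nodes, two layers (12 -> 4 -> 2), two sensitive groups.
rng = np.random.default_rng(0)
n, k1, k2 = 12, 4, 2
A = rng.random((n, n))
A = ((A + A.T) > 1.0).astype(float)        # symmetric 0/1 adjacency
np.fill_diagonal(A, 0.0)
H1, H2 = rng.random((n, k1)), rng.random((k1, k2))
W = rng.random((k2, k2))
F = group_encoding([0] * 7 + [1] * 5)

total0, recon, fair = dfnmf_objective(A, [H1, H2], W, F, lam=0.0)
total10, _, _ = dfnmf_objective(A, [H1, H2], W, F, lam=10.0)
```

With $\lambda=0$ the loss reduces to pure reconstruction. Increasing $\lambda$ adds the same fairness residual with a larger weight, which is the single-knob trade-off described above.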

Abstract

Fair graph clustering seeks partitions that respect network structure while maintaining proportional representation across sensitive groups, with applications spanning community detection, team formation, resource allocation, and social network analysis. Many existing approaches enforce rigid constraints or rely on multi-stage pipelines (e.g., spectral embedding followed by $k$-means), limiting trade-off control, interpretability, and scalability. We introduce DFNMF, an end-to-end deep nonnegative tri-factorization tailored to graphs that directly optimizes cluster assignments with a soft statistical-parity regularizer. A single parameter $\lambda$ tunes the fairness-utility balance, while nonnegativity yields parts-based factors and transparent soft memberships. The optimization uses sparse-friendly alternating updates and scales near-linearly with the number of edges. Across synthetic and real networks, DFNMF achieves substantially higher group balance at comparable modularity, often dominating state-of-the-art baselines on the Pareto front. The code is available at https://github.com/SiamakGhodsi/DFNMF.git.
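One way a reconstruction term can be evaluated at a cost that grows with the number of edges rather than with $n^2$ is sketched below. This is an illustration under assumptions, not the authors' implementation: the loss is assumed to be $\|\mathbf{A}-\mathbf{\Psi}\mathbf{W}\mathbf{\Psi}^\top\|_F^2$, the graph is stored as plain edge arrays (`rows`, `cols`, `vals`) instead of a CSR object, and `sparse_recon_loss` is a hypothetical name. The sketch expands the squared norm as $\|\mathbf{A}\|_F^2 - 2\langle\mathbf{A},\mathbf{\Psi}\mathbf{W}\mathbf{\Psi}^\top\rangle + \mathrm{tr}(\mathbf{W}^\top\mathbf{G}\mathbf{W}\mathbf{G})$ with $\mathbf{G}=\mathbf{\Psi}^\top\mathbf{\Psi}$, so the dense $n\times n$ product is never formed.

```python
import numpy as np

def sparse_recon_loss(rows, cols, vals, Psi, W):
    """||A - Psi W Psi^T||_F^2 using only the stored nonzeros of A."""
    G = Psi.T @ Psi                          # k x k Gram matrix
    a_sq = np.sum(vals ** 2)                 # ||A||_F^2
    # <A, Psi W Psi^T>: one k-dimensional dot product per stored entry.
    per_edge = np.einsum("ek,ek->e", Psi[rows] @ W, Psi[cols])
    cross = np.sum(vals * per_edge)
    model_sq = np.trace(W.T @ G @ W @ G)     # ||Psi W Psi^T||_F^2
    return a_sq - 2.0 * cross + model_sq

# Small random undirected graph to compare against the dense formula.
rng = np.random.default_rng(1)
n, k = 30, 3
dense = np.triu((rng.random((n, n)) < 0.1).astype(float), 1)
dense = dense + dense.T
rows, cols = np.nonzero(dense)
vals = dense[rows, cols]
Psi, W = rng.random((n, k)), rng.random((k, k))

fast = sparse_recon_loss(rows, cols, vals, Psi, W)
slow = np.linalg.norm(dense - Psi @ W @ Psi.T, "fro") ** 2
```

The per-edge term costs $O(|E|k^2)$ and the Gram matrix costs $O(nk^2)$, which is consistent with the near-linear scaling in the number of edges claimed above when $k$ is small.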

Paper Structure

This paper contains 59 sections, 1 theorem, 33 equations, 8 figures, 12 tables, 1 algorithm.

Key Result

Lemma 1

Let $\bm{H}$ be a nonnegative, column-stochastic matrix. Then the condition $\bm{F}^\top\bm{H} = \bm{0}$ from Definition def:fair_nmf is equivalent to the fairness condition in Eq. eq:org_bal.
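The lemma can be checked numerically on the group counts of the 16-node example in Figure 1 (10 male, 6 female, two clusters of 8). The sketch below assumes that $\bm{F}$ is the centered group-indicator encoding, meaning the indicator of a group minus that group's overall share; this is a common choice but is not spelled out in this summary. The node ordering is arbitrary and does not follow the figure's node ids, and the helper names are hypothetical.

```python
import numpy as np

def centered_encoding(groups):
    """Assumed F: group indicator minus the group's overall proportion."""
    groups = np.asarray(groups)
    labels = np.unique(groups)
    cols = [(groups == s).astype(float) - np.mean(groups == s)
            for s in labels[:-1]]
    return np.stack(cols, axis=1)

def column_stochastic(assign, k):
    """Hard assignments -> membership matrix whose columns sum to 1."""
    H = np.zeros((len(assign), k))
    H[np.arange(len(assign)), assign] = 1.0
    return H / H.sum(axis=0, keepdims=True)

# Nodes 0-9 are male (0), nodes 10-15 are female (1).
groups = np.array([0] * 10 + [1] * 6)
# Utilitarian split: cluster 0 holds 6M + 2F, cluster 1 holds 4M + 4F.
util = np.array([0] * 6 + [1] * 4 + [0] * 2 + [1] * 4)
# Fair split: both clusters hold 5M + 3F.
fair = np.array([0] * 5 + [1] * 5 + [0] * 3 + [1] * 3)

F = centered_encoding(groups)
res_util = F.T @ column_stochastic(util, 2)   # nonzero residual
res_fair = F.T @ column_stochastic(fair, 2)   # zero residual
```

Under this encoding each entry of $\bm{F}^\top\bm{H}$ is a cluster's male share minus the global male share of 10/16. It vanishes for the 5M:3F split and equals $\pm 0.125$ for the 6M:2F and 4M:4F split.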

Figures (8)

  • Figure 1: Fair clustering of a 16-node graph (10 Male, 6 Female) into two equal-sized clusters. Left: Utilitarian clustering yields a structure-driven partition with a 6M:2F distribution for green and 4M:4F for lavender cluster, resulting in gender imbalance. Right: Fair clustering achieves a balanced 5M:3F distribution in both clusters by swapping memberships of nodes 1 and 13.
  • Figure 2: DFNMF schematic and example. A 45-node graph with an imbalanced 40%/60% gender distribution (18 vs. 27 nodes) is factorized through $\bm{H}_1,\bm{H}_2,\bm{H}_3$. Two solutions illustrate the effect of $\lambda$: small $\lambda$ preserves structure but yields imbalance (5:9, 5:11, 8:7); large $\lambda$ improves parity (7:11, 5:7, 6:9), highlighting the utility–fairness trade-off.
  • Figure 3: SBM networks with varying node sizes: comparison of clustering and fairness metrics. Arrows ($\uparrow$/$\downarrow$) indicate whether higher/lower is better.
  • Figure 4: Convergence curves on SBM graphs (5K and 10K nodes), comparing DFNMF with/without fairness regularization ($\lambda = 10$ vs. $\lambda = 0$). Both variants converge rapidly; the fair version yields a higher loss due to the added fairness penalty.
  • Figure 5: DFNMF hierarchy on a 60-node graph. (a) Input graph; node shapes denote sensitive groups. (b) Micro-clusters (A–L) discovered by the first layer ($\bm H_1$). (c) Three coarse communities obtained by aggregating micro-clusters via $\bm H_2$; final node–community affinities are $\bm{\Psi}=\bm H_1\bm H_2$.
  • ...and 3 more figures

Theorems & Definitions (5)

  • Definition 1: Generalized Demographic Balance
  • Definition 2: Demographic Group Encoding
  • Definition 3: Soft Balanced Fairness
  • Lemma 1: Equivalence to Demographic Balance
  • proof