Fairness-Aware Dense Subgraph Discovery

Emmanouil Kariotakis, Nicholas D. Sidiropoulos, Aritra Konar

TL;DR

This paper tackles fairness-aware densest subgraph discovery by introducing two tractable formulations, FADSG-I and FADSG-II, that incorporate two distinct regularizers to trade off subgraph density against representation of a protected vertex group. By enabling a regularization path over the parameter λ, the authors can trace Pareto-optimal frontiers between density and fairness and compute the price of fairness (PoF) for targeted representations. They establish polynomial-time solvability (via max-flow constructions) and adopt a scalable Super-Greedy++ based algorithm to approximate solutions, enabling practical analysis on large graphs. Experimental results across real-world datasets show that FADSG-I often achieves lower PoF than prior methods while providing multiple fairness levels, and FADSG-II offers flexible control over protected-vertex inclusion with competitive density. Collectively, the work advances fair subgraph discovery by delivering tractable, tunable formulations and rigorous trade-off analysis, with potential impact on diverse social and network applications.

Abstract

Dense subgraph discovery (DSD) is a key graph mining primitive with myriad applications, including finding densely connected communities that are diverse in their vertex composition. In such a context, it is desirable to extract a dense subgraph that provides fair representation of the diverse subgroups that constitute the vertex set while incurring a small loss in terms of subgraph density. Existing methods for promoting fairness in DSD have important limitations: the associated formulations are NP-hard in the worst case and they do not provide flexible notions of fairness, making it non-trivial to analyze the inherent trade-off between density and fairness. In this paper, we introduce two tractable formulations for fair DSD, each offering a different notion of fairness. Our methods provide a structured and flexible approach to incorporate fairness, accommodating varying fairness levels. We introduce the fairness-induced relative loss in subgraph density as a price-of-fairness measure to quantify the associated trade-off. We are the first to study such a notion in the context of detecting fair dense subgraphs. Extensive experiments on real-world datasets demonstrate that our methods not only match but frequently outperform existing solutions, sometimes incurring less than half the subgraph density loss of prior art, while achieving the target fairness levels. Importantly, they excel in scenarios that previous methods fail to adequately handle, i.e., those with extreme subgroup imbalances, highlighting their effectiveness in extracting fair and dense solutions.
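The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the usual DSD density (induced edges divided by number of vertices) and reads the price of fairness as the relative density loss, (unconstrained density - fair density) / unconstrained density. The function names `induced_density`, `protected_fraction`, and `price_of_fairness` are hypothetical helpers for illustration, not the authors' code.

```python
from itertools import combinations


def induced_density(edges, subset):
    """Number of edges with both endpoints in `subset`, divided by |subset|.

    Assumption: this is the standard average-degree style density used in DSD.
    """
    s = set(subset)
    inside = sum(1 for u, v in edges if u in s and v in s)
    return inside / len(s)


def protected_fraction(subset, protected):
    """Fraction of the vertices in `subset` that belong to the protected group."""
    s = set(subset)
    return len(s & set(protected)) / len(s)


def price_of_fairness(rho_unconstrained, rho_fair):
    """Relative loss in density caused by imposing fairness (assumed definition)."""
    return (rho_unconstrained - rho_fair) / rho_unconstrained


# A 5-clique has 10 edges on 5 vertices, so its density is 2.
k5_edges = list(combinations(range(5), 2))
rho_k5 = induced_density(k5_edges, range(5))

# Plugging in the two densities quoted in the Figure 1 caption (2 and 1.875).
pof_fig1 = price_of_fairness(2.0, 1.875)

# A subset with two protected vertices out of four is perfectly balanced.
balance = protected_fraction({0, 1, 2, 3}, protected={2, 3, 7})
```

Under these assumptions, the two densities quoted for the toy example in Figure 1 give a price of fairness of 0.0625, i.e. a 6.25% density loss for a perfectly balanced subgraph.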

Paper Structure

This paper contains 35 sections, 10 theorems, 61 equations, 20 figures, 7 tables, 1 algorithm.

Key Result

Lemma 4

$r_2(\mathcal{S}) = 1 \Leftrightarrow \frac{|\mathcal{S} \cap \mathcal{S}_p|}{|\mathcal{S}_p|} = \frac{1}{2}$.

Figures (20)

  • Figure 1: A toy example of Fair DSG with $2$ vertex groups; protected (red) and unprotected (blue). Left: Densest subgraph without fairness constraints; density $=2$. Right: Densest subgraph with fairness constraints (equal number of red and blue vertices); density $=1.875$.
  • Figure 2: Comparing perfectly balanced fair subgraphs obtained via prior art and FADSG-I (ours, purple) on 4 different Twitch datasets. (Top): Price of fairness (the lower, the better). (Bottom): Fraction of protected vertices in induced subgraph (set to $50\%$ for perfect balance).
  • Figure 3: The "lollipop" graph for $n=16$. The unprotected set (blue) forms a clique on 4 vertices whereas the protected set (red) forms a path graph on 12 vertices.
  • Figure 4: Comparing induced perfectly balanced fair subsets of prior art and FADSG-I (Ours), on different Amazon datasets. Top: PoF. Bottom: Fraction of protected vertices in induced subsets, $r_1(\mathcal{S})$.
  • Figure 5: The fraction of protected vertices in the induced subset, $r_1(\mathcal{S})$, as function of $\lambda$, for FADSG-I. Left: Amazon hpc - Middle: Amazon op - Right: Twitch ptbr.
  • ...and 15 more figures
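The "lollipop" graph of Figure 3 is small enough to build by hand, which makes the density/fairness tension concrete. The sketch below is an illustration only: the caption does not say how the path attaches to the clique, so joining one end of the path to a single clique vertex is an assumption, and the helper names `lollipop` and `induced_density` are hypothetical.

```python
def lollipop(n_clique=4, n_path=12):
    """Clique on `n_clique` unprotected vertices plus a path on `n_path` protected ones.

    Assumption: the path is attached by one edge to the last clique vertex.
    """
    clique = list(range(n_clique))
    path = list(range(n_clique, n_clique + n_path))
    edges = [(u, v) for i, u in enumerate(clique) for v in clique[i + 1:]]
    edges.append((clique[-1], path[0]))   # assumed attachment edge
    edges += list(zip(path, path[1:]))    # consecutive path vertices
    return clique, path, edges


def induced_density(edges, subset):
    """Induced edge count divided by subset size (standard DSD density, assumed)."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s) / len(s)


clique, path, edges = lollipop()

rho_clique = induced_density(edges, clique)      # 6 edges / 4 vertices
balanced = clique + path[:4]                     # 4 unprotected + 4 protected
rho_balanced = induced_density(edges, balanced)  # 10 edges / 8 vertices
rho_all = induced_density(edges, clique + path)  # 18 edges / 16 vertices
```

Under this construction the clique alone has density 1.5 but contains no protected vertices, while the balanced subset formed by the clique and the four nearest path vertices has density 1.25. Every protected vertex added along the path contributes only one edge, so raising protected representation pulls the density toward 1.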

Theorems & Definitions (22)

  • Definition 1: DSG
  • Definition 2: FADSG-I
  • Definition 3: FADSG-II
  • Lemma 4
  • Lemma 5
  • Proposition 6
  • Proposition 7
  • proof
  • proof
  • Lemma 8
  • ...and 12 more