Fair densest subgraph across multiple graphs
Chamalee Wickrama Arachchi, Nikolaj Tatti
TL;DR
The paper studies how to find a fair densest subgraph across multiple graph snapshots and introduces two constrained variants. Fair Densest Subgraph (FDS) maximizes the sum of snapshot densities subject to a bound $\alpha$ on the density disparity $\Delta$, the difference between the maximum and minimum induced density. Smallest Difference Subgraph (SDS) minimizes $\Delta$ while guaranteeing a total density of at least $\sigma$. Both problems are proved NP-hard (with SDS also being inapproximable), and the paper provides exact integer programming approaches (FDS-IP, SDS-IP, MDS-IP) together with polynomial-time heuristics (FDS-Grd, SDS-Grd, MDS-Grd). Experiments on synthetic datasets with ground truth and on real-world temporal networks show that the algorithms recover fair dense components, and case studies illustrate their practical usefulness. Overall, the work offers a framework and tools for trading off fairness and density in multi-graph settings.
Abstract
Many real-world networks can be modeled as graphs. Finding dense subgraphs is a key problem in graph mining with applications in diverse domains. In this paper, we consider two variants of the densest subgraph problem where multiple graph snapshots are given and the goal is to find a fair densest subgraph without over-representing the density among the graph snapshots. More formally, given a set of graphs and an input parameter $\alpha$, we find a dense subgraph maximizing the sum of densities across snapshots such that the difference between the maximum and minimum induced density is at most $\alpha$. We prove that this problem is NP-hard and present an exact algorithm based on integer programming and a practical polynomial-time heuristic. We also consider a minimization variant where, given an input parameter $\sigma$, we find a dense subgraph which minimizes the difference between the maximum and minimum density while inducing a total density of at least $\sigma$ across the graph snapshots. We prove the NP-hardness of the problem and propose two algorithms: an exponential-time algorithm based on integer programming and a greedy algorithm. We present an extensive experimental study that shows that our algorithms can find the ground truth in synthetic datasets and produce good results in real-world datasets. Finally, we present case studies that show the usefulness of our problem.
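To make the two problem statements concrete, the following is a minimal Python sketch that evaluates a candidate vertex set and solves both variants by exhaustive search on a toy instance. It rests on assumptions not stated in the abstract: density is taken to be the number of induced edges divided by the number of vertices (a common convention for densest subgraph problems), all snapshots share one vertex set, and the function names and example data are hypothetical. The exhaustive search is exponential and is not the integer programming or greedy approach proposed in the paper; it only illustrates the objectives and constraints.

```python
from itertools import combinations
from typing import Optional, Sequence, Set, Tuple

Edge = Tuple[int, int]
Snapshot = Sequence[Edge]


def induced_density(edges: Snapshot, nodes: Set[int]) -> float:
    """Assumed density: edges with both endpoints in `nodes`, divided by |nodes|."""
    if not nodes:
        return 0.0
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    return inside / len(nodes)


def total_and_gap(snapshots: Sequence[Snapshot], nodes: Set[int]) -> Tuple[float, float]:
    """Return (sum of snapshot densities, maximum density minus minimum density)."""
    densities = [induced_density(edges, nodes) for edges in snapshots]
    return sum(densities), max(densities) - min(densities)


def brute_force_fds(
    snapshots: Sequence[Snapshot], vertices: Sequence[int], alpha: float
) -> Tuple[Optional[Set[int]], float]:
    """Maximize total density over non-empty sets whose gap is at most alpha."""
    best_set, best_total = None, float("-inf")
    for k in range(1, len(vertices) + 1):
        for combo in combinations(vertices, k):
            total, gap = total_and_gap(snapshots, set(combo))
            if gap <= alpha and total > best_total:
                best_set, best_total = set(combo), total
    return best_set, best_total


def brute_force_sds(
    snapshots: Sequence[Snapshot], vertices: Sequence[int], sigma: float
) -> Tuple[Optional[Set[int]], float]:
    """Minimize the gap over non-empty sets whose total density is at least sigma."""
    best_set, best_gap = None, float("inf")
    for k in range(1, len(vertices) + 1):
        for combo in combinations(vertices, k):
            total, gap = total_and_gap(snapshots, set(combo))
            if total >= sigma and gap < best_gap:
                best_set, best_gap = set(combo), gap
    return best_set, best_gap  # best_set is None when no set reaches sigma


# Hypothetical toy instance: two snapshots over the vertices 0..3.
VERTICES = [0, 1, 2, 3]
SNAPSHOTS = [
    [(0, 1), (0, 2), (1, 2), (2, 3)],  # snapshot 1: triangle plus a pendant edge
    [(0, 1), (1, 2)],                  # snapshot 2: a path
]

if __name__ == "__main__":
    print(total_and_gap(SNAPSHOTS, {0, 1, 2}))
    print(brute_force_fds(SNAPSHOTS, VERTICES, alpha=0.4))
    print(brute_force_sds(SNAPSHOTS, VERTICES, sigma=1.5))
```

In this toy instance, tightening $\alpha$ forces the first search toward smaller, more balanced sets, while raising $\sigma$ in the second search can leave no feasible set at all, which is why that function may return `None`.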
