Finding Densest Subgraphs with Edge-Color Constraints
Lutz Oettershagen, Honglian Wang, Aristides Gionis
TL;DR
This work studies the densest subgraph problem (DSP) under edge-color constraints. It formalizes an edge-colored DSP with a vector $\textbf{h}$ of per-color quotas, in three variants that require at least, at most, or exactly $\textbf{h}$ colored edges. The decision versions are shown to be NP-complete, even with two colors. For the at-least-$\textbf{h}$ variant, the work gives a linear-time constant-factor approximation for everywhere sparse graphs, and it introduces a related non-colored variant that requires at least $h$ edges. The methods extend to graphs with multiple edge colors via a multi-graph transformation that preserves the constant-factor guarantees. Experiments on real networks show strong practical performance and scalability, including a coauthorship use case that highlights the benefits of edge diversity in dense communities.
Abstract
We consider a variant of the densest subgraph problem in networks with single or multiple edge attributes. For example, in a social network, the edge attributes may describe the type of relationship between users, such as friends, family, or acquaintances, or different types of communication. For conceptual simplicity, we view the attributes as edge colors. The new problem we address is to find a diverse densest subgraph that fulfills given requirements on the numbers of edges of specific colors. When searching for a dense community in a social network, our problem enforces the requirement that the community is diverse according to criteria specified by the edge attributes. We show that the decision versions of finding a densest subgraph with exactly, at most, and at least $\textbf{h}$ colored edges, where $\textbf{h}$ is a vector of color requirements, are NP-complete, even for two colors. For the problem of finding a densest subgraph with at least $\textbf{h}$ colored edges, we provide a linear-time constant-factor approximation algorithm when the input graph is sparse. Along the way, we introduce the related problem of finding a densest subgraph with at least $h$ (non-colored) edges, show its hardness, and also provide a linear-time constant-factor approximation. In our experiments, we demonstrate the efficacy and efficiency of our new algorithms.
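To make the at-least-$\textbf{h}$ constraint concrete, the following minimal sketch evaluates it on a toy graph. It assumes the standard density measure (number of induced edges divided by number of nodes), which the abstract does not spell out, and it uses exhaustive search purely for illustration. It is not the paper's approximation algorithm. All names (`induced_stats`, `satisfies_at_least`, `brute_force_at_least`) and the example graph are hypothetical.

```python
from collections import Counter
from itertools import combinations


def induced_stats(edges, nodes):
    """Return (density, color_counts) for the subgraph induced by `nodes`.

    `edges` is an iterable of (u, v, color) triples.
    Density is assumed to be |E(S)| / |S| (an assumption, see lead-in).
    """
    nodes = set(nodes)
    counts = Counter(c for u, v, c in edges if u in nodes and v in nodes)
    num_edges = sum(counts.values())
    density = num_edges / len(nodes) if nodes else 0.0
    return density, counts


def satisfies_at_least(counts, h):
    """Check that every color c has at least h[c] induced edges."""
    return all(counts.get(color, 0) >= quota for color, quota in h.items())


def brute_force_at_least(edges, h):
    """Exhaustively find a densest node set meeting the quotas in `h`.

    Exponential time: only suitable for tiny illustrative graphs.
    Returns (node_set, density), or (None, -1.0) if no set is feasible.
    """
    vertices = sorted({x for u, v, _ in edges for x in (u, v)})
    best_nodes, best_density = None, -1.0
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            density, counts = induced_stats(edges, subset)
            if satisfies_at_least(counts, h) and density > best_density:
                best_nodes, best_density = set(subset), density
    return best_nodes, best_density


# Toy graph: a red 4-clique on nodes 1..4 plus one blue edge (4, 5).
edges = [(u, v, "red") for u, v in combinations([1, 2, 3, 4], 2)]
edges.append((4, 5, "blue"))

unconstrained = brute_force_at_least(edges, {})
diverse = brute_force_at_least(edges, {"red": 1, "blue": 1})
```

On this toy graph the unconstrained optimum is the red clique (6 edges on 4 nodes), while requiring at least one blue edge forces node 5 into the solution and lowers the density to 7 edges on 5 nodes. This is the trade-off between density and diversity that the constrained problem captures.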
