Densest Subhypergraph: Negative Supermodular Functions and Strongly Localized Methods
Yufan Huang, David F. Gleich, Nate Veldt
TL;DR
This work advances localized densest subgraph analysis by generalizing the anchored densest subgraph problem to hypergraphs and adding a tunable locality parameter $\varepsilon$ that controls how strongly the output set $S$ concentrates around a seed set $R$, with objectives such as $\frac{e[S]-\varepsilon\mathrm{Vol}(S\cap\bar{R})}{|S|}$ and variants. It provides both global and localized algorithms: a strongly polynomial density-improvement framework for the general Densest Supermodular Subset (DSS) problem, where the function may take negative values, and a flow-based exact method plus a strongly-local flow algorithm for the anchored densest subhypergraph (ADSH) problem, the latter applicable when $\varepsilon\ge 1$. The paper proves fundamental results for extending DSS to negative and non-standard value functions, constructs a per-hyperedge gadget to reduce ADSH to hypergraph $s$-$t$ cuts, and demonstrates practical performance on web-domain and planted-density datasets, with clear speedups over traditional binary-search approaches. Overall, it delivers new theoretical tools and scalable algorithms for discovering dense, localized structures in hypergraphs, with applications such as real-time story detection on social media and improving access to data stores. The strongly-local algorithm has a runtime that depends on the size of the seed region rather than the entire input, which makes seed-driven exploration of dense substructures scalable.
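To make the objective $\frac{e[S]-\varepsilon\mathrm{Vol}(S\cap\bar{R})}{|S|}$ concrete, the following is a minimal sketch that evaluates it on a toy hypergraph. It assumes that $e[S]$ counts the hyperedges fully contained in $S$ and that $\mathrm{Vol}(\cdot)$ sums node degrees (the number of hyperedges containing each node); the paper's exact definitions and weighting may differ. The function name `anchored_density` and the toy data are hypothetical and only for illustration.

```python
from fractions import Fraction


def anchored_density(hyperedges, S, R, eps):
    """Evaluate (e[S] - eps * Vol(S \\ R)) / |S| under the assumptions above.

    hyperedges: iterable of node collections (unweighted hyperedges)
    S: candidate output set, R: seed set, eps: locality parameter
       (pass an int or Fraction to keep the arithmetic exact)
    """
    S, R = set(S), set(R)
    if not S:
        raise ValueError("S must be nonempty")
    edges = [set(e) for e in hyperedges]
    # Assumption: e[S] = number of hyperedges entirely inside S.
    e_S = sum(1 for e in edges if e <= S)
    # Assumption: Vol(S \ R) = sum of degrees of the nodes of S outside R.
    vol_outside = sum(1 for e in edges for v in e if v in S and v not in R)
    return (Fraction(e_S) - Fraction(eps) * vol_outside) / len(S)


# Hypothetical toy hypergraph on nodes 0..4 with seed set R = {0, 1, 2}.
H = [(0, 1, 2), (1, 2, 3), (0, 2, 3), (3, 4)]
R = {0, 1, 2}

inside_seed = anchored_density(H, {0, 1, 2}, R, 1)      # stays within R
grown_eps1 = anchored_density(H, {0, 1, 2, 3}, R, 1)    # adds node 3, eps = 1
grown_eps0 = anchored_density(H, {0, 1, 2, 3}, R, 0)    # adds node 3, no penalty
print(inside_seed, grown_eps1, grown_eps0)
```

In this toy instance, adding the non-seed node 3 raises $e[S]$ from 1 to 3 but also incurs a penalty proportional to its degree, so whether growing beyond $R$ pays off depends on $\varepsilon$: a larger $\varepsilon$ keeps the output closer to the seed set.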
Abstract
Dense subgraph discovery is a fundamental primitive in graph and hypergraph analysis which, among other applications, has been used for real-time story detection on social media and improving access to data stores of social networking systems. We present several contributions for localized densest subgraph discovery, which seeks dense subgraphs located near given seed sets of nodes. We first introduce a generalization of a recent $\textit{anchored densest subgraph}$ problem, extending this previous objective to hypergraphs and also adding a tunable locality parameter that controls the extent to which the output set overlaps with seed nodes. Our primary technical contribution is to prove when it is possible to obtain a strongly-local algorithm for solving this problem, meaning that the runtime depends only on the size of the input set. We provide a strongly-local algorithm that applies whenever the locality parameter is not too small, and show via counterexample why strongly-local algorithms are impossible below a certain threshold. Along the way to proving our results for localized densest subgraph discovery, we also provide several advances in solving global dense subgraph discovery objectives. These include the first strongly polynomial time algorithm for the densest supermodular set problem and a flow-based exact algorithm for a heavy and dense subgraph discovery problem in graphs with arbitrary node weights. We demonstrate our algorithms on several web-based data analysis tasks.
