Aggregating maximal cliques in real-world graphs
Noga Alon, Sabyasachi Basu, Shweta Jain, Haim Kaplan, Jakub Łącki, Blair D. Sullivan
TL;DR
This work introduces ρ-dense aggregators, which succinctly summarize maximal clique structure by covering all maximal cliques with a small set of dense clusters. It provides a degeneracy-aware algorithm, CliqueAgg, that achieves subexponential aggregator size for any ρ<1 and near-linear running time and aggregator size on graphs of bounded degeneracy, along with a matching lower bound. Empirically, aggregators yield substantial speedups over state-of-the-art clique enumeration while producing far more compact representations and reducing cluster overlap. The results suggest that dense-cluster aggregators offer a practical and scalable alternative to exhaustive maximal clique enumeration in real-world networks.
Abstract
Maximal clique enumeration is a fundamental graph mining task, but its utility is often limited by computational intractability and highly redundant output. To address these challenges, we introduce \emph{$ρ$-dense aggregators}, a novel approach that succinctly captures maximal clique structure. Instead of listing all cliques, we identify a small collection of clusters, each with edge density at least $ρ$, that collectively contain every maximal clique. Whereas a graph can have exponentially many maximal cliques, we prove that for all $ρ < 1$, every graph admits a $ρ$-dense aggregator of \emph{sub-exponential} size, $n^{O(\log_{1/ρ} n)}$, and provide an algorithm achieving this bound. For graphs with bounded degeneracy, a typical characteristic of real-world networks, our algorithm runs in near-linear time and produces near-linear size aggregators. We also establish a matching lower bound on aggregator size, proving that our results are essentially tight. In an empirical evaluation on real-world networks, we demonstrate significant practical benefits of using aggregators: our algorithm is consistently faster than the state-of-the-art clique enumeration algorithm, with median speedups over $6\times$ for $ρ=0.1$ (and over $300\times$ in an extreme case), while delivering a much more concise structural summary.
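To make the definition concrete, the following minimal sketch checks whether a given collection of clusters is a $ρ$-dense aggregator of a small graph. It is not the paper's CliqueAgg algorithm: it enumerates maximal cliques by brute force with the standard Bron-Kerbosch procedure, which is exactly the cost aggregators are meant to avoid. Two assumptions are made here: edge density is taken to be the number of edges inside a cluster divided by the number of vertex pairs, and clusters with fewer than two vertices are treated as having density 1. The function names `maximal_cliques`, `edge_density`, and `is_dense_aggregator` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch and pivoting.

    adj maps each vertex to the set of its neighbors (undirected graph).
    """
    out = []

    def expand(r, p, x):
        if not p and not x:
            out.append(frozenset(r))
            return
        # Pick the pivot with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return out


def edge_density(adj, cluster):
    """Assumed definition: edges inside the cluster over all vertex pairs."""
    k = len(cluster)
    if k < 2:
        return 1.0  # convention assumed for this sketch
    edges = sum(1 for u, v in combinations(cluster, 2) if v in adj[u])
    return edges / (k * (k - 1) / 2)


def is_dense_aggregator(adj, clusters, rho):
    """True if every cluster has density >= rho and every maximal
    clique is contained in at least one cluster."""
    clusters = [frozenset(c) for c in clusters]
    if any(edge_density(adj, c) < rho for c in clusters):
        return False
    return all(any(q <= c for c in clusters) for q in maximal_cliques(adj))


# Example graph: two triangles {0, 1, 2} and {1, 2, 3} sharing the edge 1-2.
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}

print(is_dense_aggregator(adj, [{0, 1, 2, 3}], 0.8))
print(is_dense_aggregator(adj, [{0, 1, 2, 3}], 0.9))
```

In this example the single cluster $\{0,1,2,3\}$ contains 5 of the 6 possible edges, so it covers both maximal cliques at density $5/6$. It therefore qualifies as an aggregator for $ρ = 0.8$ but not for $ρ = 0.9$, which illustrates how lowering $ρ$ allows fewer, larger clusters.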
