Aggregating maximal cliques in real-world graphs

Noga Alon, Sabyasachi Basu, Shweta Jain, Haim Kaplan, Jakub Łącki, Blair D. Sullivan

TL;DR

This work introduces ρ-dense aggregators to succinctly summarize maximal clique structure by covering all maximal cliques with a small set of dense clusters. It provides a degeneracy-aware algorithm, CliqueAgg, achieving subexponential aggregator sizes for any ρ<1 and near-linear performance on graphs with bounded degeneracy, along with a matching lower bound. Empirically, aggregators yield substantial speedups over state-of-the-art clique enumeration, while producing far more compact representations and reducing cluster overlap. The results suggest that dense-cluster aggregators offer a practical and scalable alternative to exhaustive maximal clique enumeration in real-world networks.
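
To make the covering condition concrete, here is a minimal brute-force sketch that checks whether a given set of clusters is a ρ-dense aggregator. It is not the paper's CliqueAgg algorithm. It assumes that the density of a cluster is the fraction of its vertex pairs joined by an edge, and the toy graph and the helper names (`build_adj`, `maximal_cliques`, `density`, `is_dense_aggregator`) are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def build_adj(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def density(adj, cluster):
    """Fraction of vertex pairs in `cluster` joined by an edge (assumed definition)."""
    pairs = list(combinations(sorted(cluster), 2))
    if not pairs:
        return 1.0  # convention assumed here for clusters with fewer than 2 vertices
    return sum(1 for u, v in pairs if v in adj[u]) / len(pairs)


def is_dense_aggregator(adj, clusters, rho):
    """True if every cluster has density >= rho and every maximal clique lies in some cluster."""
    dense = all(density(adj, s) >= rho for s in clusters)
    covered = all(any(c <= set(s) for s in clusters) for c in maximal_cliques(adj))
    return dense and covered


# Hypothetical toy graph: two triangles sharing an edge, plus a pendant edge.
adj = build_adj([(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)])
print(is_dense_aggregator(adj, [{1, 2, 3, 4}, {4, 5}], 0.8))  # True
print(is_dense_aggregator(adj, [{1, 2, 3, 4}], 0.8))          # False: clique {4, 5} is uncovered
```

In this toy graph, two clusters stand in for three maximal cliques. The checker enumerates every maximal clique, so it only serves to validate small examples; avoiding that enumeration is the point of an aggregator.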

Abstract

Maximal clique enumeration is a fundamental graph mining task, but its utility is often limited by computational intractability and highly redundant output. To address these challenges, we introduce $ρ$-dense aggregators, a novel approach that succinctly captures maximal clique structure. Instead of listing all cliques, we identify a small collection of clusters with edge density at least $ρ$ that collectively contain every maximal clique. In contrast to maximal clique enumeration, we prove that for all $ρ < 1$, every graph admits a $ρ$-dense aggregator of sub-exponential size, $n^{O(\log_{1/ρ} n)}$, and provide an algorithm achieving this bound. For graphs with bounded degeneracy, a typical characteristic of real-world networks, our algorithm runs in near-linear time and produces near-linear size aggregators. We also establish a matching lower bound on aggregator size, proving our results are essentially tight. In an empirical evaluation on real-world networks, we demonstrate significant practical benefits for the use of aggregators: our algorithm is consistently faster than the state-of-the-art clique enumeration algorithm, with median speedups over $6\times$ for $ρ=0.1$ (and over $300\times$ in an extreme case), while delivering a much more concise structural summary.
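
The bounded-degeneracy condition can be illustrated with the standard peeling procedure: repeatedly remove a vertex of minimum remaining degree, and the largest degree seen at removal time is the graph's degeneracy. The sketch below is a simple quadratic-time version written for readability; it is not taken from the paper, and the function name `degeneracy_ordering` and the toy graph are hypothetical.

```python
def degeneracy_ordering(adj):
    """Peel minimum-degree vertices; return (removal order, degeneracy).

    `adj` maps each vertex to the set of its neighbours and must be symmetric.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}  # working copy
    order, degeneracy = [], 0
    while remaining:
        # Pick a vertex with the fewest remaining neighbours.
        v = min(remaining, key=lambda u: len(remaining[u]))
        degeneracy = max(degeneracy, len(remaining[v]))
        for w in remaining[v]:
            remaining[w].discard(v)
        del remaining[v]
        order.append(v)
    return order, degeneracy


# Hypothetical toy graph: a triangle {1, 2, 3} with a pendant vertex 4 attached to 3.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
order, d = degeneracy_ordering(toy)
print(order[0], d)  # 4 2
```

A bucket-based variant of the same peeling runs in time linear in the number of edges, which is what makes degeneracy orderings practical on large sparse networks.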

Paper Structure

This paper contains 24 sections, 16 theorems, 2 equations, 6 figures, 6 tables, 2 algorithms.

Key Result

lemma 1

For every recursive call of CliqueAggRec$(G,H,\rho,C, X)$ made by CliqueAgg, the following invariants hold:

Figures (6)

  • Figure 1: A graph with $8$ maximal cliques (each maximal clique is a triangle), and a $0.8$-dense aggregator $S=\{S_1, S_2\}$, where $S_1 = \{1, 2, 5, 6, 7\}$ and $S_2 = \{2, 3, 4, 5, 8\}$. Both $density(S_1) \geq 0.8$ and $density(S_2) \geq 0.8$, and for every maximal clique $C$ in the graph, either $C \subseteq S_1$ or $C \subseteq S_2$.
  • Figure 2: The variation in aggregator size as the density threshold increases; we normalize the cluster count by the number of edges.
  • Figure 3: Aggregator size as a fraction of the maximal clique count, at three density thresholds. For $\rho=1.0$, the implied bar has height 1.0 for all datasets. Datasets for which QC did not terminate are excluded from this plot.
  • Figure 4: The number of clusters each vertex belongs to (defined as membership distributions in Section \ref{sec:redundancy}) in aggregators of different thresholds for two datasets: skitter (top) and web-BerkStan (bottom).
  • Figure 5: The increase in cluster count in our aggregators at different thresholds when pruning is turned off.
  • ...and 1 more figure

Theorems & Definitions (18)

  • definition 1: seidman1983network
  • definition 2
  • lemma 1
  • theorem 1
  • theorem 2
  • corollary 1
  • theorem 3
  • theorem 4
  • lemma 2
  • theorem 4
  • ...and 8 more