Scalable $k$-clique Densest Subgraph Search

Xiaowei Ye; Miao Qiao; Rong-Hua Li; Qi Zhang; Guoren Wang

TL;DR

This work addresses scalable $k$-clique densest subgraph search ($k$-$\mathsf{DSS}$) by introducing an SCT-based convex programming formulation, ${\mathsf{SCT\text{-}CP}}(G)$, and solving it with a Frank–Wolfe algorithm, ${\mathsf{PSCTL}}$, whose running time is independent of the number of $k$-cliques $|\mathcal{C}_k(V)|$. To scale to massive graphs, it also introduces a polynomial-time sampling method, ${\mathsf{CPSample}}$, which uses ${\mathsf{CCPATH}}$ to sample $k$-cliques uniformly and estimates density with provable accuracy guarantees. Theoretical results link ${\mathsf{SCT\text{-}CP}}(G)$ to near-optimal $k$-$\mathsf{DSS}$ solutions, with bounds such as $\rho_k(H(r^*)) \ge (1 - 1/(k|V^*|)) \rho_k(V^*)$, and the sampling method comes with Chernoff-based guarantees that hold under mild conditions. Experiments on 12 large real-world graphs show orders-of-magnitude speedups over state-of-the-art methods: ${\mathsf{CPSample}}$ handles networks with up to $1.8\times 10^9$ edges at competitive accuracy, while ${\mathsf{PSCTL}}$ remains efficient and memory-friendly thanks to the SCT-based framework. The authors also release open-source code, enabling reproducibility and practical adoption in large-scale network analysis.
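
To give a feel for how tight the stated bound is, the factor can be evaluated for concrete values. The numbers below are illustrative only, and the reading of $V^*$ as an optimal vertex set and $H(r^*)$ as the subgraph extracted from an optimal solution $r^*$ is an assumption based on the notation in the summary. For $k = 3$ and $|V^*| = 10$:

$$
1 - \frac{1}{k|V^*|} = 1 - \frac{1}{3 \cdot 10} = \frac{29}{30} \approx 0.967,
\qquad\text{so}\qquad
\rho_k(H(r^*)) \ge 0.967\,\rho_k(V^*).
$$

Since any vertex set containing at least one $k$-clique has $|V^*| \ge k$, the factor is never smaller than $1 - 1/k^2$ (for example $8/9$ when $k = 3$).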

Abstract

In this paper, we present a collection of novel and scalable algorithms for the $k$-clique densest subgraph problem ($k$-$\mathsf{DSS}$) in network analysis. We propose $\mathsf{PSCTL}$, a novel algorithm based on the Frank-Wolfe approach that addresses $k$-$\mathsf{DSS}$ by solving a distinct convex programming problem. $\mathsf{PSCTL}$ approximates $k$-$\mathsf{DSS}$ with near-optimal guarantees. Its notable advantage is a time complexity that is independent of the number of $k$-cliques, which makes it highly efficient in practical applications. Additionally, we present $\mathsf{CPSample}$, a sampling-based algorithm capable of handling networks on an unprecedented scale, reaching up to $1.8\times 10^9$ edges. By using the $\mathsf{CCPATH}$ algorithm as a uniform $k$-clique sampler, $\mathsf{CPSample}$ processes large-scale network data efficiently and comes with a detailed analysis of its accuracy guarantees. Together, these contributions represent a significant advance in $k$-clique densest subgraph discovery. In experimental evaluations, our algorithms run orders of magnitude faster than the current state-of-the-art solutions.
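
As a rough illustration of the Frank-Wolfe idea mentioned above, the following is a minimal sketch on a toy graph. It is not $\mathsf{PSCTL}$: it enumerates every $k$-clique explicitly, which is exactly the cost the paper's SCT-based formulation is designed to avoid. The convex program used here (each clique spreads one unit of weight over its vertices, and the sum of squared vertex loads is minimized), the step size $1/(t+1)$, and the names `k_cliques` and `frank_wolfe_kdss` are all assumptions made for illustration.

```python
from itertools import combinations


def k_cliques(adj, k):
    """Enumerate k-cliques by brute force (toy graphs only)."""
    nodes = sorted(adj)
    return [c for c in combinations(nodes, k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def vertex_loads(adj, alpha):
    """r[u] = total weight that all cliques currently assign to u."""
    r = {u: 0.0 for u in adj}
    for shares in alpha:
        for u, w in shares.items():
            r[u] += w
    return r


def frank_wolfe_kdss(adj, k, iterations=10):
    """Hypothetical sketch: approximate the k-clique densest subgraph."""
    cliques = k_cliques(adj, k)
    if not cliques:
        return set(), 0.0
    # Each clique starts by splitting its unit weight evenly.
    alpha = [{u: 1.0 / k for u in c} for c in cliques]
    for t in range(1, iterations + 1):
        r = vertex_loads(adj, alpha)
        gamma = 1.0 / (t + 1)  # diminishing step size (assumed choice)
        for shares in alpha:
            # Linear minimization step: push weight to the least-loaded vertex.
            target = min(shares, key=lambda u: r[u])
            for u in shares:
                shares[u] = (1 - gamma) * shares[u] + (gamma if u == target else 0.0)
    r = vertex_loads(adj, alpha)
    # Sort vertices by load and return the densest prefix.
    order = sorted(adj, key=lambda u: -r[u])
    pos = {u: i for i, u in enumerate(order)}
    closed = [0] * len(order)  # cliques fully contained once prefix i is reached
    for c in cliques:
        closed[max(pos[u] for u in c)] += 1
    best_set, best_density, count = set(), 0.0, 0
    for i in range(len(order)):
        count += closed[i]
        density = count / (i + 1)
        if density > best_density:
            best_set, best_density = set(order[:i + 1]), density
    return best_set, best_density
```

Calling `frank_wolfe_kdss(adj, 3)` with `adj` given as a dict mapping each node to its set of neighbours returns a vertex set together with its triangle density (number of triangles divided by number of vertices). The prefix-extraction step at the end is one common way to turn the vertex loads into a subgraph; the paper's own extraction procedure may differ.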

Paper Structure (14 sections, 23 theorems, 4 equations, 8 figures, 6 tables, 4 algorithms)

Introduction
Preliminaries
New Convex Programming for $k$-DSS
SCT-based Convex Programming
FW-based Algorithm for SCT-CP(G)
Analysis of the Algorithm
New Sampling-Based Algorithm
The $\mathsf{CPSample}$ algorithm
Analysis of the Algorithm
Experiments
Results of the FW-based algorithms
Results of the sampling-based algorithms
Related Work
Conclusion

Key Result

lemma 1

Consider a $k$-clique $C$ and let $x$ be the node in $C$ with the smallest ranking and $y$ the node with the largest ranking. Either $r(y) = r(x)$ or $\alpha_y^C = 0$.
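
The intuition behind this lemma can be sketched with a short exchange argument. The sketch assumes the usual convex-programming setup, which is not spelled out in this summary: $\alpha_u^C \ge 0$ is the weight clique $C$ assigns to node $u$ with $\sum_{u \in C} \alpha_u^C = 1$, the ranking is $r(u) = \sum_{C \ni u} \alpha_u^C$, the objective is $\sum_u r(u)^2$, and the lemma is stated for an optimal solution. Suppose $r(y) > r(x)$ and $\alpha_y^C > 0$. Moving a small weight $\varepsilon > 0$ from $y$ to $x$ within $C$ changes the objective by

$$
\bigl(r(x)+\varepsilon\bigr)^2 + \bigl(r(y)-\varepsilon\bigr)^2 - r(x)^2 - r(y)^2
= 2\varepsilon\bigl(r(x) - r(y)\bigr) + 2\varepsilon^2 ,
$$

which is negative whenever $0 < \varepsilon < r(y) - r(x)$. This would contradict optimality, so under these assumptions either $r(y) = r(x)$ or $\alpha_y^C = 0$.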

Figures (8)

Figure 1: The three-step paradigm for $k$-$\mathsf{DSS}$.
Figure 2: Illustration of the SCT.
Figure 3: Illustration of $\mathsf{PSCTL}$ on the example graph for one iteration.
Figure 4: Running time of the Frank-Wolfe based algorithms ($T=10$).
Figure 5: Running time of different Frank-Wolfe based algorithms with varying $T$.
...and 3 more figures

Theorems & Definitions (25)

definition 1: $k$-clique Densest Subgraph
lemma 1: kclpp
lemma 2: kclppDanisch17
lemma 3: PIVOTER
lemma 4
lemma 5: PIVOTER
lemma 6
theorem 1
definition 2
theorem 2
...and 15 more
