Efficient Defective Clique Enumeration and Search with Worst-Case Optimal Search Space
Jihoon Jang, Yehyun Nam, Kunsoo Park, Hyunjoon Kim
TL;DR
The paper tackles the NP-hard problem of enumerating maximal and searching for maximum k-defective cliques in graphs, motivated by noise and incomplete data in real networks. It introduces a clique-first branch-and-bound framework with a novel pivoting technique that achieves worst-case optimal search space bounds, supported by a diameter-two property-based decomposition to further shrink the search. The authors also develop practical enhancements, including an efficient initial-solution computation and graph-reduction techniques, collectively delivering up to four orders of magnitude speedups on large real-world graphs. Extensive experiments demonstrate both theoretical optimality and practical scalability, making the approach competitive for large-scale network analysis tasks such as link prediction and community detection.
Abstract
A $k$-defective clique is a relaxation of the traditional clique definition, allowing up to $k$ missing edges. This relaxation is crucial in various real-world applications such as link prediction, community detection, and social network analysis. Although the problems of enumerating maximal $k$-defective cliques and searching a maximum $k$-defective clique have been extensively studied, existing algorithms suffer from limitations such as the combinatorial explosion of small partial solutions and sub-optimal search spaces. To address these limitations, we propose a novel clique-first branch-and-bound framework that first generates cliques and then adds missing edges. Furthermore, we introduce a new pivoting technique that achieves a search space size of $\mathcal{O}(3^{\frac{n}{3}} \cdot n^k)$, where $n$ is the number of vertices in the input graph. We prove that the worst-case number of maximal $k$-defective cliques is $Ω(3^{\frac{n}{3}} \cdot n^k)$ when $k$ is a constant, establishing that our algorithm's search space is worst-case optimal. Leveraging the diameter-two property of defective cliques, we further reduce the search space size to $\mathcal{O}(n \cdot 3^{\fracδ{3}} \cdot (δΔ)^k)$, where $δ$ is the degeneracy and $Δ$ is the maximum degree of the input graph. We also propose an efficient framework for maximum $k$-defective clique search based on our branch-and-bound, together with practical techniques to reduce the search space. Experiments on real-world benchmark datasets with more than 1 million edges demonstrate that each of our proposed algorithms for maximal $k$-defective clique enumeration and maximum $k$-defective clique search outperforms the respective state-of-the-art algorithms by up to four orders of magnitude in terms of processing time.
