Maximum Degree-Based Quasi-Clique Search via an Iterative Framework

Hongbo Xia, Kaiqiang Yu, Shengxin Liu, Cheng Long, Xun Zhou

TL;DR

This paper tackles the NP-hard problem of finding the largest $\gamma$-quasi-clique in a graph by introducing IterQC, an iterative framework that reduces MaxQC to a sequence of hereditary $k$-plex problems. It adds two key optimizations: a preprocessing stage based on $k$-core bounds to shrink the problem and initialize the search, and a pseudo lower bound that improves branch-and-bound pruning within the $k$-plex solver. The authors prove the basic framework's correctness, provide a detailed time-complexity analysis, and demonstrate through extensive experiments that IterQC achieves up to four orders of magnitude speedup and solves many more instances than the prior best methods DDA and FastQC. The work offers scalable, exact MaxQC solutions on large real-world graphs, with potential extensions to other cohesive subgraph models.

Abstract

Cohesive subgraph mining is a fundamental problem in graph theory with numerous real-world applications, such as social network analysis and protein-protein interaction modeling. Among various cohesive subgraphs, the $\gamma$-quasi-clique is widely studied for its flexibility in requiring each vertex to connect to at least a $\gamma$ proportion of other vertices in the subgraph. However, solving the maximum $\gamma$-quasi-clique problem is NP-hard and further complicated by the lack of the hereditary property, which makes designing efficient pruning strategies challenging. Existing algorithms, such as DDA and FastQC, either struggle with scalability or exhibit significant performance declines for small values of $\gamma$. In this paper, we propose a novel algorithm, IterQC, which reformulates the maximum $\gamma$-quasi-clique problem as a series of $k$-plex problems that possess the hereditary property. IterQC introduces a non-trivial iterative framework and incorporates two key optimization techniques: (1) the pseudo lower bound (pseudo LB) technique, which leverages information across iterations to improve the efficiency of branch-and-bound searches, and (2) the preprocessing technique that reduces problem size and unnecessary iterations. Extensive experiments demonstrate that IterQC achieves up to four orders of magnitude speedup and solves significantly more graph instances compared to state-of-the-art algorithms DDA and FastQC.
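
To make the two models in the abstract concrete, the following is a minimal sketch of the membership checks, not the authors' implementation. It assumes the standard definitions: a vertex set $H$ is a $\gamma$-quasi-clique if every vertex has at least $\gamma(|H|-1)$ neighbors inside $H$, and a $k$-plex if every vertex has at least $|H|-k$ neighbors inside $H$. The function names, the adjacency-dictionary representation, and the helper `plex_parameter` are illustrative choices; the formula in `plex_parameter` follows from comparing the two degree conditions and is not quoted from the paper. Passing $\gamma$ as a `Fraction` or a decimal string avoids floating-point rounding at the threshold.

```python
from fractions import Fraction
from math import floor


def induced_degrees(adj, subset):
    """Degree of each vertex of `subset` inside the subgraph induced by `subset`."""
    s = set(subset)
    return {v: len(adj[v] & s) for v in s}


def is_gamma_quasi_clique(adj, subset, gamma):
    """Assumed definition: every vertex has >= gamma * (|H| - 1) neighbors in H."""
    s = set(subset)
    need = Fraction(gamma) * (len(s) - 1)
    return all(d >= need for d in induced_degrees(adj, s).values())


def is_k_plex(adj, subset, k):
    """Assumed definition: every vertex has >= |H| - k neighbors in H."""
    s = set(subset)
    return all(d >= len(s) - k for d in induced_degrees(adj, s).values())


def plex_parameter(gamma, size):
    """Smallest k such that every gamma-quasi-clique on `size` vertices is a k-plex.

    Derived from the two definitions above: size - ceil(gamma * (size - 1))
    equals floor((1 - gamma) * (size - 1)) + 1.
    """
    return floor((1 - Fraction(gamma)) * (size - 1)) + 1


# Illustrative 4-cycle a-b-c-d-a (hypothetical data, not from the paper).
cycle = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "a"},
}
gamma = Fraction(3, 5)

# The whole cycle is a 0.6-quasi-clique, but its subset {a, c} is not:
# the quasi-clique property is not inherited by subsets.
whole_is_qc = is_gamma_quasi_clique(cycle, "abcd", gamma)
subset_is_qc = is_gamma_quasi_clique(cycle, "ac", gamma)

# The k-plex property is inherited: both the cycle and {a, c} are 2-plexes.
k = plex_parameter(gamma, 4)
whole_is_plex = is_k_plex(cycle, "abcd", k)
subset_is_plex = is_k_plex(cycle, "ac", k)
```

On this toy graph the quasi-clique check succeeds for the full cycle and fails for the subset, while the $k$-plex check succeeds for both. This is the hereditary gap that motivates working with $k$-plex subproblems.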

Paper Structure

This paper contains 17 sections, 10 theorems, 8 figures, 10 tables, 5 algorithms.

Key Result

Lemma 1

When the input graph $G=(V,E)$ is a $\gamma$-QC, the basic-framework algorithm correctly returns $|V|$ as the optimum solution.

Figures (8)

  • Figure 1: Number of solved instances with varying $\gamma$.
  • Figure 2: Number of solved instances on 10th DIMACS.
  • Figure 3: Number of solved instances on real-world.
  • Figure 4: Scalability test on G30.
  • Figure 5: An example of the basic-framework algorithm with $\gamma = 0.55$.
  • ...and 3 more figures

Theorems & Definitions (13)

  • Definition 1
  • Definition 2 ($k$-plex, Seidman & Foster 1978)
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Corollary 1
  • Lemma 4
  • Lemma 5
  • Definition 3
  • Lemma 6
  • ...and 3 more