Efficient Enumeration of Large Maximal k-Plexes

Qihao Cheng, Da Yan, Tianhao Wu, Lyuheng Yuan, Ji Cheng, Zhongyi Huang, Yang Zhou

TL;DR

The paper tackles exact enumeration of all maximal $k$-plexes of size at least $q$ in large graphs, a problem that is NP-hard for general $k$. It introduces a fast branch-and-bound framework that partitions the search space into independent tasks using seed subgraphs derived from a degeneracy ordering, and leverages a novel pivot strategy that maximizes saturated vertices to shrink candidates. Tight upper bounds and three vertex-pair pruning rules substantially prune the search, while a task-based parallelization with a timeout mechanism mitigates stragglers and preserves cache locality. The approach achieves substantial speedups over state-of-the-art methods both sequentially (up to $5\times$) and in parallel (up to $18.9\times$ with 16 threads), with ablations showing up to $7\times$ gains from pruning strategies. These results enable efficient discovery of large, cohesive subgraphs in biology and social networks, where $k$-plexes provide a robust alternative to cliques in noisy data.
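The degeneracy ordering mentioned above can be sketched as follows. This is a minimal illustration of the standard "repeatedly remove a minimum-degree vertex" procedure, not the paper's implementation; the function names `degeneracy_ordering` and `later_neighbors`, the tie-breaking rule, and the toy graph are assumptions made for the example. How the paper builds its seed subgraphs from this ordering is not specified in this summary, so the sketch stops at the ordering itself.

```python
def degeneracy_ordering(adj):
    """Return the vertices in a degeneracy order.

    adj maps each vertex to the set of its neighbours (undirected graph).
    This simple version rescans for the minimum each round (quadratic time);
    a bucket queue would make it linear, but is omitted for brevity.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    order = []
    while remaining:
        # Ties are broken by vertex id (an arbitrary, hypothetical choice).
        v = min(remaining, key=lambda u: (degree[u], u))
        order.append(v)
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return order


def later_neighbors(adj, order):
    """For each vertex, the neighbours that come after it in the order."""
    position = {v: i for i, v in enumerate(order)}
    return {v: {w for w in adj[v] if position[w] > position[v]} for v in order}


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant vertex 3 on 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
order = degeneracy_ordering(toy)
later = later_neighbors(toy, order)
```

In a degeneracy order, each vertex has at most as many later neighbours as the graph's degeneracy, which is why partitioning the search by such an order tends to keep the per-vertex subproblems small.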

Abstract

Finding cohesive subgraphs in a large graph has many important applications, such as community detection and biological network analysis. A clique is often too strict a model of cohesion, since communities and biological modules rarely form perfect cliques, for reasons such as data noise. The $k$-plex was therefore introduced as a popular clique relaxation: a graph in which every vertex is adjacent to all but at most $k$ vertices. In this paper, we propose a fast branch-and-bound algorithm, together with a task-based parallel version, to enumerate all maximal $k$-plexes with at least $q$ vertices. Our algorithm adopts an effective search-space partitioning approach that yields a lower time complexity, a new pivot vertex selection method that reduces the number of candidate vertices, an effective upper-bounding technique that prunes useless branches, and three novel pruning techniques based on vertex pairs. Our parallel algorithm uses a timeout mechanism to eliminate straggler tasks, and it maximizes cache locality while ensuring load balancing. Extensive experiments show that, compared with state-of-the-art algorithms, our sequential and parallel algorithms enumerate large maximal $k$-plexes with up to $5 \times$ and $18.9 \times$ speedup, respectively. Ablation results also demonstrate that our pruning techniques bring up to $7 \times$ speedup over our basic algorithm.
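To make the problem statement concrete, here is a brute-force sketch of the enumeration task on a tiny graph. It is not the paper's branch-and-bound algorithm and is exponential in the number of vertices. It assumes the usual convention that the "at most $k$" non-adjacent vertices include the vertex itself, so a vertex set $P$ is a $k$-plex when every $v \in P$ has at least $|P| - k$ neighbours inside $P$ (a 1-plex is then a clique). The names `is_k_plex` and `maximal_k_plexes_bruteforce` are hypothetical.

```python
from itertools import combinations


def is_k_plex(adj, P, k):
    """True if every vertex of P has at least |P| - k neighbours inside P."""
    P = set(P)
    return all(len(adj[v] & P) >= len(P) - k for v in P)


def maximal_k_plexes_bruteforce(adj, k, q):
    """Return the set of maximal k-plexes with at least q vertices.

    Tests every vertex subset, so it is only usable on toy inputs.
    """
    vertices = sorted(adj)
    plexes = {
        frozenset(c)
        for r in range(1, len(vertices) + 1)
        for c in combinations(vertices, r)
        if is_k_plex(adj, c, k)
    }
    result = set()
    for P in plexes:
        if len(P) < q:
            continue
        # Because k-plexes are hereditary, P is maximal exactly when no
        # single extra vertex can be added while staying a k-plex.
        if any((P | {v}) in plexes for v in vertices if v not in P):
            continue
        result.add(P)
    return result


# Toy graph (hypothetical): the 4-cycle 0-1-2-3-0.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
two_plexes = maximal_k_plexes_bruteforce(cycle, k=2, q=3)
cliques = maximal_k_plexes_bruteforce(cycle, k=1, q=2)
```

On the 4-cycle, the whole vertex set is a single maximal 2-plex, whereas with $k = 1$ the maximal structures are just the four edges, which illustrates how the relaxation merges fragments that cliques would split apart.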

Paper Structure

This paper contains 24 sections, 16 theorems, 9 equations, 14 figures, 7 tables, 4 algorithms.

Key Result

Theorem 3.2

(Hereditariness) Given a $k$-plex $P\subseteq V$, any subset $P'\subseteq P$ is also a $k$-plex.
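The intuition behind hereditariness is short: removing one vertex from $P$ lowers $|P|$ by one and lowers each remaining vertex's degree inside $P$ by at most one, so the condition $\deg_P(v) \ge |P| - k$ still holds. The sketch below checks the property exhaustively on a small example. It assumes the same convention as above (the $k$ non-adjacent vertices include the vertex itself), and `is_k_plex` and `all_subsets_are_k_plexes` are hypothetical helper names, not functions from the paper.

```python
from itertools import combinations


def is_k_plex(adj, P, k):
    """True if every vertex of P has at least |P| - k neighbours inside P."""
    P = set(P)
    return all(len(adj[v] & P) >= len(P) - k for v in P)


def all_subsets_are_k_plexes(adj, P, k):
    """Exhaustively confirm that every subset of P is a k-plex."""
    members = sorted(P)
    return all(
        is_k_plex(adj, subset, k)
        for r in range(len(members) + 1)
        for subset in combinations(members, r)
    )


# Toy graph (hypothetical): the 4-cycle 0-1-2-3-0, whose vertex set is a 2-plex.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
P = {0, 1, 2, 3}
hereditary_here = is_k_plex(cycle, P, 2) and all_subsets_are_k_plexes(cycle, P, 2)
```

This property is what lets a set-enumeration search grow $k$-plexes one vertex at a time: any branch whose current set is not a $k$-plex can be discarded, since no superset can repair it.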

Figures (14)

  • Figure 1: Set-Enumeration Search Tree
  • Figure 2: Decomposition of Top-Level Task $T_{v_i}$
  • Figure 3: A Toy Graph for Illustration
  • Figure 4: Upper Bound Illustration for Theorem \ref{lemma::bound2}
  • Figure 5: Upper Bound Illustration for Theorem \ref{lemma::bound4}
  • ...and 9 more figures

Theorems & Definitions (21)

  • Definition 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Definition 3.4
  • Theorem 3.5
  • Example 4.1: Pivot Selection
  • Theorem 5.1
  • Corollary 5.2
  • Theorem 5.3
  • Example 5.4
  • ...and 11 more