Fast Maximum Common Subgraph Search: A Redundancy-Reduced Backtracking Approach

Kaiqiang Yu, Kaixin Wang, Cheng Long, Laks Lakshmanan, Reynold Cheng

TL;DR

This work tackles the NP-hard maximum common subgraph (MCS) problem by introducing RRSplit, a backtracking algorithm that combines redundancy-reducing reductions with a tighter upper bound to prune the search space. The core innovations are vertex-equivalence-based reductions, maximality-based reductions, and a new vertex-equivalence-based upper bound, all supported by an exclusion structure that tracks past explorations. The authors prove a worst-case time complexity of $O^*((|V_G|+1)^{|V_Q|})$, matching the best-known theoretical bound, and show through extensive experiments that RRSplit outperforms state-of-the-art McSplit-based methods by several orders of magnitude in practice. The results indicate strong potential for scalable exact MCS computation and suggest directions for extending the approach to labeled graphs and broader graph domains.
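
One informal way to read the bound is as a count of candidate assignments. This is only a counting intuition, under the assumption that each vertex of $Q$ is either mapped to some vertex of $G$ or left unmatched; it is not necessarily how the authors' proof proceeds. Each of the $|V_Q|$ vertices then has at most $|V_G|+1$ options, so the number of assignments is at most

$$\underbrace{(|V_G|+1)\cdot(|V_G|+1)\cdots(|V_G|+1)}_{|V_Q|\ \text{factors}} = (|V_G|+1)^{|V_Q|}.$$

For example, with $|V_Q|=3$ and $|V_G|=4$ this gives $5^3 = 125$ candidate assignments. The $O^*$ notation suppresses polynomial factors, such as the cost of checking a single assignment.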

Abstract

Given two input graphs, finding the largest subgraph that occurs in both, i.e., finding the maximum common subgraph, is a fundamental operator for evaluating the similarity between two graphs in graph data analysis. Existing works for solving the problem are of either theoretical or practical interest, but not both. Specifically, the algorithms with a theoretical guarantee on the running time are known to be inefficient in practice, while algorithms following the recently proposed backtracking framework called McSplit run fast in practice but do not have any theoretical guarantees. In this paper, we propose a new backtracking algorithm called RRSplit, which at once achieves better practical efficiency and provides a non-trivial theoretical guarantee on the worst-case running time. To achieve the former, we develop a series of reductions and upper bounds for reducing redundant computations, i.e., the time spent exploring unpromising branches that hold no maximum common subgraph. To achieve the latter, we formally prove that RRSplit incurs a worst-case time complexity which matches the best-known complexity for the problem. Finally, we conduct extensive experiments on four benchmark graph collections, and the results demonstrate that our algorithm outperforms the practical state-of-the-art by several orders of magnitude.
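
To make the notion of a backtracking search with upper-bound pruning concrete, here is a minimal sketch of a naive maximum common induced subgraph search. It is neither McSplit nor RRSplit: it uses none of the paper's reductions, and its bound is the trivial "matched pairs plus remaining vertices" bound. The function name `max_common_induced_subgraph`, the adjacency-set graph representation, and the branching order are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Assumed representation: undirected, unlabeled graph as vertex -> neighbor set.
Graph = Dict[int, Set[int]]


def max_common_induced_subgraph(Q: Graph, G: Graph) -> Dict[int, int]:
    """Return a largest mapping from vertices of Q to vertices of G that
    preserves both adjacency and non-adjacency (naive backtracking)."""
    q_vertices: List[int] = sorted(Q)
    best: Dict[int, int] = {}

    def consistent(u: int, v: int, phi: Dict[int, int]) -> bool:
        # The pair (u, v) may extend phi only if u's adjacency to every
        # already-mapped vertex of Q mirrors v's adjacency in G.
        return all((u2 in Q[u]) == (v2 in G[v]) for u2, v2 in phi.items())

    def backtrack(i: int, phi: Dict[int, int], used: Set[int]) -> None:
        nonlocal best
        if len(phi) > len(best):
            best = dict(phi)
        # Trivial upper bound: every remaining vertex gets matched.
        remaining_q = len(q_vertices) - i
        remaining_g = len(G) - len(used)
        if len(phi) + min(remaining_q, remaining_g) <= len(best):
            return  # this branch cannot beat the best solution found so far
        u = q_vertices[i]
        for v in G:
            if v not in used and consistent(u, v, phi):
                phi[u] = v
                used.add(v)
                backtrack(i + 1, phi, used)
                del phi[u]
                used.remove(v)
        # Also explore the branch in which u stays unmatched.
        backtrack(i + 1, phi, used)

    backtrack(0, {}, set())
    return best


# Usage: a 3-vertex path versus a triangle.
path = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
mapping = max_common_induced_subgraph(path, triangle)
print(len(mapping))
```

The pruning test is where a tighter upper bound would plug in: the smaller the bound, the earlier a branch is abandoned. The reductions described in the abstract would additionally skip branches that can only repeat work already done elsewhere in the search.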

Paper Structure

This paper contains 16 sections, 6 theorems, 16 equations, 13 figures, 2 tables, 2 algorithms.

Key Result

Lemma 1

Let $S=\langle q,g,\phi \rangle$ be a common subgraph of given graphs $Q$ and $G$, $u$ be a vertex in $V_q$ and $u'$ be a vertex in $\Psi(u)$. Then one of the following cases holds.

Figures (13)

  • Figure 1: Input graphs used throughout the paper
  • Figure 2: Illustrating the backtracking process ("+" means to move vertex pairs from $C$ to $S$ and "-" means to remove vertex pairs from $C$)
  • Figure 3: Illustrating cs-isomorphism and vertex equivalence (vertices, denoted by colored bullet circles, induce subgraphs $q$, $q_1$, $q_2$ and $g$; vertices in $\{u_1,u_2\}$ and $\{u_4,u_6,u_7\}$ are structurally equivalent, respectively; $\langle q,g,\phi \rangle$ is cs-isomorphic to $\langle q_1,g,\phi_1 \rangle$ (Case 1, say, exchange the mapped vertices of $u_1$ and $u_2$) and $\langle q_2,g,\phi_2 \rangle$ (Case 2, say, replace $u_4$ with $u_7$) where vertices with the same color indicate the bijection)
  • Figure 4: Illustrating the exclusion set $D$ ($\langle u',v'\rangle$ is a vertex pair in $D$; $B_{anc}$ is an ancestor of $B$, where $u'$ is selected as the branching vertex)
  • Figure 5: Running time on all datasets. For problem instances located to the right of the orange dash-dot line '- .' (resp. the green dashed line '- -'), RRSplit achieves at least a 100$\times$ (resp. 10$\times$) speedup over McSplitDAL.
  • ...and 8 more figures

Theorems & Definitions (18)

  • Definition 1: Graph isomorphism [mcgregor1982backtrack]
  • Definition 2: Induced subgraph isomorphism [mcgregor1982backtrack]
  • Definition 3: Common induced subgraph [mcgregor1982backtrack]
  • Example 1
  • Example 2
  • Example 3
  • Definition 4: Common subgraph isomorphism
  • Definition 5: Structural equivalence [nguyen2019applications]
  • Lemma 1
  • Lemma 2
  • ...and 8 more
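
The structural-equivalence definition listed above underlies the vertex-equivalence-based reductions. The sketch below assumes the common formulation in which two vertices are structurally equivalent when they have the same neighbors once the two vertices themselves are set aside. The paper's exact wording is not reproduced in this summary, so this is an assumption, as are the helper names `structurally_equivalent` and `equivalence_classes`.

```python
from typing import Dict, List, Set

# Assumed representation: undirected, unlabeled graph as vertex -> neighbor set.
Graph = Dict[int, Set[int]]


def structurally_equivalent(graph: Graph, u: int, v: int) -> bool:
    """Assumed definition: u and v have identical neighborhoods,
    ignoring u and v themselves."""
    return graph[u] - {v} == graph[v] - {u}


def equivalence_classes(graph: Graph) -> List[Set[int]]:
    """Group vertices into structural-equivalence classes.

    Under the assumed definition the relation is transitive, so comparing
    each vertex with one representative per class is enough.
    """
    classes: List[Set[int]] = []
    for u in sorted(graph):
        for cls in classes:
            representative = min(cls)
            if structurally_equivalent(graph, u, representative):
                cls.add(u)
                break
        else:
            classes.append({u})
    return classes


# Usage: in a star, all leaves are interchangeable but the center is not.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(equivalence_classes(star))
```

Intuitively, swapping two equivalent vertices inside a common subgraph yields another common subgraph of the same size, which is the kind of redundancy a search can avoid exploring twice.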