DiffGED: Computing Graph Edit Distance via Diffusion-based Graph Matching

Wei Huang, Hanchen Wang, Dong Wen, Wenjie Zhang, Ying Zhang, Xuemin Lin

TL;DR

This work tackles the NP-hard graph edit distance (GED) problem by reframing it as a generative, diffusion-based graph matching task. DiffGED, through its DiffMatch model, generates multiple diverse node matching matrices in parallel, extracts one node mapping from each, derives the corresponding edit paths, and selects the path with the fewest edit operations. The method achieves accuracy close to exact solutions while running faster than most prior hybrid approaches, and it provides diverse, interpretable edit paths that reflect the multimodal solution space. Overall, DiffGED advances scalable, path-recovering GED computation with practical applicability to real-world graphs.

Abstract

The Graph Edit Distance (GED) problem, which aims to compute the minimum number of edit operations required to transform one graph into another, is a fundamental challenge in graph analysis with wide-ranging applications. However, due to its NP-hard nature, traditional A* approaches often suffer from scalability issues, making them computationally intractable for large graphs. Many recent deep learning frameworks address GED by formulating it as a regression task, which, while efficient, fails to recover the edit path, a central interest in GED. Furthermore, recent hybrid approaches that combine deep learning with traditional methods to recover the edit path often yield poor solution quality. These methods also struggle to generate candidate solutions in parallel, resulting in increased running times. In this paper, we present a novel approach, DiffGED, that leverages a generative diffusion model to solve GED and recover the corresponding edit path. Specifically, we first generate multiple diverse node matching matrices in parallel through a diffusion-based graph matching model. Next, a node mapping is extracted from each generated matching matrix in parallel, and each extracted node mapping can be directly transformed into an edit path. Benefiting from the generative diversity provided by the diffusion model, DiffGED is less likely to fall into local sub-optimal solutions, thereby achieving superior overall solution quality close to the exact solution. Experimental results on real-world datasets demonstrate that DiffGED can generate multiple diverse edit paths with exceptionally high accuracy comparable to exact solutions while maintaining a running time shorter than most hybrid approaches.
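
The step from a node mapping to an edit path, and the selection of the best of several candidates, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes undirected graphs with node labels, unit edit costs, and a source graph with no more nodes than the target graph. The function names (`edit_cost`, `best_mapping`, `brute_force_ged`) and the toy graphs are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def normalize(edges):
    """Store undirected edges as sorted tuples so (u, v) equals (v, u)."""
    return {tuple(sorted(e)) for e in edges}


def edit_cost(labels1, edges1, labels2, edges2, mapping):
    """Count the edit operations induced by an injective node mapping.

    Assumptions (not taken from the paper): unit costs, undirected graphs,
    len(labels1) <= len(labels2), and mapping[u] is the image of node u.
    """
    n1, n2 = len(labels1), len(labels2)
    e1, e2 = normalize(edges1), normalize(edges2)
    relabels = sum(labels1[u] != labels2[mapping[u]] for u in range(n1))
    node_insertions = n2 - n1  # unmatched target nodes must be inserted
    image = {tuple(sorted((mapping[u], mapping[v]))) for u, v in e1}
    edge_deletions = len(image - e2)   # source edges with no counterpart
    edge_insertions = len(e2 - image)  # target edges not covered
    return relabels + node_insertions + edge_deletions + edge_insertions


def best_mapping(labels1, edges1, labels2, edges2, candidates):
    """Pick the candidate mapping whose induced edit path is shortest."""
    return min(candidates,
               key=lambda m: edit_cost(labels1, edges1, labels2, edges2, m))


def brute_force_ged(labels1, edges1, labels2, edges2):
    """Exact GED under the assumptions above; only feasible for tiny graphs."""
    n1, n2 = len(labels1), len(labels2)
    return min(edit_cost(labels1, edges1, labels2, edges2, p)
               for p in permutations(range(n2), n1))


# Toy example: a labeled path G and a slightly larger graph H.
labels_g, edges_g = ["A", "A", "B"], [(0, 1), (1, 2)]
labels_h, edges_h = ["A", "A", "B", "A"], [(0, 1), (1, 2), (0, 2), (2, 3)]

# Stand-ins for the k mappings that a generative model might propose.
candidates = [(0, 1, 2), (3, 2, 1), (1, 0, 3)]
best = best_mapping(labels_g, edges_g, labels_h, edges_h, candidates)
print(best, edit_cost(labels_g, edges_g, labels_h, edges_h, best))  # (0, 1, 2) 3
print(brute_force_ged(labels_g, edges_g, labels_h, edges_h))        # 3
```

Every candidate mapping yields a valid edit path, so its cost is an upper bound on the true GED. Taking the minimum over more diverse candidates can only tighten that bound, which is why candidate diversity matters for solution quality.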

Paper Structure

This paper contains 30 sections, 7 equations, 8 figures, 6 tables, and 4 algorithms.

Figures (8)

  • Figure 1: The optimal edit paths for transforming $G$ to $G'$. GED$(G,G')=4$.
  • Figure 2: An example of top-$k$ maximum weight node mappings extracted from a biased and sparse predicted node matching matrix.
  • Figure 3: An overview of DiffGED. In the first phase, DiffGED samples $k$ random initial node matching matrices, which DiffMatch then denoises. In the second phase, one node mapping is extracted from each node matching matrix in parallel, and edit paths are derived from the node mappings.
  • Figure 4: An overview of the denoising network. The blue area denotes the network input, the yellow area denotes the architecture of the denoising network, and the pink area denotes the network output.
  • Figure 5: Effectiveness and Efficiency of Top-$k$ Approaches
  • ...and 3 more figures
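
The second phase described in the Figure 3 caption, extracting one node mapping from each node matching matrix, can be sketched as follows. The captions do not state the extraction rule, so the greedy rule used here is an assumption, and `greedy_mapping` is a hypothetical name. The sketch also assumes the matrix has no more rows than columns.

```python
def greedy_mapping(scores):
    """Extract one injective node mapping from a matching-score matrix.

    scores[i][j] is the predicted score for matching source node i to
    target node j. Greedy rule (an assumption, not confirmed by the
    source): repeatedly take the highest-scoring pair whose row and
    column are both still free. Requires rows <= columns.
    """
    n, m = len(scores), len(scores[0])
    cells = sorted(((scores[i][j], i, j) for i in range(n) for j in range(m)),
                   key=lambda c: -c[0])  # stable sort: ties keep (i, j) order
    mapping = [None] * n
    used_cols = set()
    assigned = 0
    for _, i, j in cells:
        if mapping[i] is None and j not in used_cols:
            mapping[i] = j
            used_cols.add(j)
            assigned += 1
            if assigned == n:
                break
    return mapping


# Each matrix is processed independently, so k matrices can be handled in
# parallel; a plain loop is used here for clarity.
matrices = [
    [[0.9, 0.8, 0.1], [0.7, 0.2, 0.3]],
    [[0.1, 0.6, 0.2], [0.5, 0.4, 0.3]],
]
print([greedy_mapping(s) for s in matrices])  # [[0, 2], [1, 0]]
```

Because each extraction touches only its own matrix, the $k$ extractions have no dependencies on one another, which is what makes this phase straightforward to parallelize.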