GEDAN: Learning the Edit Costs for Graph Edit Distance

Francesco Leonardi, Markus Orsi, Jean-Louis Reymond, Kaspar Riesen

TL;DR

This paper aligns Graph Edit Distance (GED) with task-specific similarity by learning context-aware edit costs instead of assuming fixed unit costs. Because exact GED computation is NP-hard, the distance itself is approximated with a neural model. The paper introduces GEDAN, a fully differentiable framework that couples multi-scale node representations from a Graph Isomorphism Network (GIN) with a Generalized Additive Model (GAM) to learn node- and edge-level edit costs, and uses a pre-trained Gumbel-Sinkhorn module to approximate a Linear Sum Assignment Problem in an unsupervised, self-organizing manner. The model supports both unsupervised GED approximation and supervised training with a contrastive loss and downstream-task supervision, so the cost structure can be optimized end to end. Experiments on molecular and synthetic datasets show competitive GED approximation performance, improved downstream predictive power when costs are learned end to end, and interpretable cost maps that highlight chemically or structurally relevant regions, which illustrates the method's practical value for domains such as molecular analysis. Limitations include computational overhead and a maximum graph size of 128 nodes imposed by the Gumbel-Sinkhorn module; future work targets scalability, broader graph types, and theoretical guarantees on the learned costs.
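To make the assignment step more concrete, the following is a minimal NumPy sketch of the standard Gumbel-Sinkhorn operator applied to a small cost matrix. It is not the authors' pre-trained module: the function names (`gumbel_sinkhorn`, `brute_force_lsap`), the temperature `tau`, the iteration count, and the toy cost matrix are all assumptions chosen for illustration. The sketch only shows how repeated row and column normalization turns a cost matrix into a soft, nearly doubly stochastic matching that approaches the Linear Sum Assignment solution at low temperature.

```python
import itertools
import numpy as np


def _logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis.
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))


def gumbel_sinkhorn(cost, tau=0.1, noise_scale=0.0, n_iters=200, rng=None):
    """Soft assignment matrix for a square cost matrix (illustrative sketch).

    cost        : (n, n) array, lower is better.
    tau         : temperature; smaller values give sharper matchings.
    noise_scale : scale of the optional Gumbel noise (0.0 disables it).
    """
    log_alpha = -np.asarray(cost, dtype=float)
    if noise_scale > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        u = rng.uniform(low=1e-12, high=1.0, size=log_alpha.shape)
        log_alpha = log_alpha + noise_scale * (-np.log(-np.log(u)))
    log_alpha = log_alpha / tau
    for _ in range(n_iters):
        # Normalize rows, then columns, in log space.
        log_alpha = log_alpha - _logsumexp(log_alpha, axis=1)
        log_alpha = log_alpha - _logsumexp(log_alpha, axis=0)
    return np.exp(log_alpha)


def brute_force_lsap(cost):
    """Exact assignment by enumerating permutations (tiny matrices only)."""
    cost = np.asarray(cost, dtype=float)
    n = cost.shape[0]
    best_perm, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        total = float(cost[np.arange(n), list(perm)].sum())
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost


if __name__ == "__main__":
    # Hypothetical 3x3 node-to-node cost matrix.
    cost = np.array([[4.0, 1.0, 3.0],
                     [2.0, 0.0, 5.0],
                     [3.0, 2.0, 2.0]])
    P = gumbel_sinkhorn(cost)
    print("soft matching:\n", np.round(P, 3))
    print("hard matching from argmax:", P.argmax(axis=1))
    print("exact assignment:", brute_force_lsap(cost))
```

Because every step is a differentiable normalization, a loss computed on the soft matching can propagate gradients back to whatever produced the cost matrix, which is the property an end-to-end cost-learning framework relies on. The brute-force solver is included only as a reference for very small matrices.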

Abstract

Graph Edit Distance (GED) is defined as the minimum cost transformation of one graph into another and is a widely adopted metric for measuring the dissimilarity between graphs. The major problem of GED is that its computation is NP-hard, which has in turn led to the development of various approximation methods, including approaches based on neural networks (NN). However, most NN methods assume a unit cost for edit operations -- a restrictive and often unrealistic simplification, since topological and functional distances rarely coincide in real-world data. In this paper, we propose a fully end-to-end Graph Neural Network framework for learning the edit costs for GED, at a fine-grained level, aligning topological and task-specific similarity. Our method combines an unsupervised self-organizing mechanism for GED approximation with a Generalized Additive Model that flexibly learns contextualized edit costs. Experiments demonstrate that our approach overcomes the limitations of non-end-to-end methods, yielding directly interpretable graph matchings, uncovering meaningful structures in complex graphs, and showing strong applicability to domains such as molecular analysis.
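For reference, the definition in the first sentence is commonly written as follows. The notation is the usual textbook one and is an assumption here, since this page does not reproduce the paper's own equations: $\Upsilon(G_1, G_2)$ denotes the set of edit paths (sequences of node and edge insertions, deletions, and substitutions) that transform $G_1$ into $G_2$, and $c(e_i)$ is the cost of edit operation $e_i$.

$$
\mathrm{GED}(G_1, G_2) = \min_{(e_1, \dots, e_k) \in \Upsilon(G_1, G_2)} \sum_{i=1}^{k} c(e_i)
$$

Under the unit-cost assumption, $c(e_i) = 1$ for every operation, so GED simply counts edits. Learning the edit costs means replacing the fixed $c$ with a learned, context-dependent function, so that two edits of the same type can contribute differently to the distance.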

Paper Structure

This paper contains 34 sections, 25 equations, 29 figures, 19 tables.

Figures (29)

  • Figure 1: (a) shows the structure of the five matrices used in GEDAN. (b) is an example of how the distance $d^{(k)}$ penalizes mismatched nodes, allowing $\mathbf{P}$ to more precisely identify graph correspondences.
  • Figure 2: Illustrative example of cost learning via $f_{\theta}^{(k)}$. Here, a single-node difference drives the dissimilarity, and the substitution cost is learned to align the GED with the functional discrepancy.
  • Figure 3: Cost analysis of models on the FreeSolv dataset. (a) shows a direct comparison between two molecules using GEDAN, while (b) and (c) show the overall cost analyses.
  • Figure 4: The architectural scheme of GEDAN.
  • Figure 5: Matching example on PTC MR using $\mathbf{M}_k$ matrices.
  • ...and 24 more figures