Graph Similarity Computation via Interpretable Neural Node Alignment

Jingjing Wang, Hongjie Zhu, Haoran Xie, Fu Lee Wang, Xiaoliang Xu, Yuxiang Wang

TL;DR

The paper tackles graph similarity by approximating Graph Edit Distance (GED) and Maximum Common Subgraph (MCS) with an interpretable neural framework. It introduces Graph Neural Alignment (GNA), which jointly learns node embeddings, a cost matrix, and a differentiable hard alignment via a Gumbel-Sinkhorn module, producing a near-bijective node matching while remaining differentiable. The method achieves state-of-the-art or competitive results on GED prediction and graph retrieval across three real-world datasets, and provides node-level alignments that can be visualized, which improves interpretability. This approach enables practical, scalable graph similarity computation with transparent edit operations, and suggests extensions to edge alignment and heterogeneous graphs.

Abstract

Graph similarity computation is an essential task in many real-world graph-related applications, such as retrieving similar drugs given a query chemical compound or finding a user's potential friends in a social network database. Graph Edit Distance (GED) and Maximum Common Subgraph (MCS) are the two commonly used domain-agnostic metrics for evaluating graph similarity in practice. Unfortunately, computing the exact GED is known to be an NP-hard problem. To address this limitation, neural network based models have been proposed to approximate the calculation of GED/MCS. However, deep learning models are well-known "black boxes", so the one-to-one node/subgraph alignment process that characterizes the classical computation of GED and MCS is hidden. Existing methods have paid attention to approximating the node/subgraph alignment (soft alignment), but the one-to-one node alignment (hard alignment) has not yet been solved. To fill this gap, in this paper we propose a novel interpretable neural node alignment model that does not rely on node alignment ground truth information. Firstly, the quadratic assignment problem in classical GED computation is relaxed to a linear alignment by embedding the features in the node embedding space. Secondly, a differentiable Gumbel-Sinkhorn module is proposed to generate the optimal one-to-one node alignment matrix in an unsupervised manner. Experimental results on real-world graph datasets demonstrate that our method outperforms the state-of-the-art methods in graph similarity computation and graph retrieval tasks, achieving up to a 16% reduction in Mean Squared Error and up to a 12% improvement in the retrieval evaluation metrics, respectively.
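
The two steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes both graphs have the same number of nodes (for example after padding with dummy nodes), uses squared Euclidean distance between node embeddings as the linear cost, and applies the standard Gumbel-Sinkhorn operator (Gumbel noise, temperature scaling, alternating row and column normalization). The names `pairwise_cost`, `gumbel_sinkhorn`, `tau`, `noise_scale`, and `n_iters` are hypothetical and chosen only for illustration.

```python
import numpy as np


def pairwise_cost(x, y):
    """Squared Euclidean distance between every node embedding in x and y.

    Assumption: this stands in for the learned cost matrix of the paper.
    """
    diff = x[:, None, :] - y[None, :, :]
    return (diff ** 2).sum(axis=-1)


def _logsumexp(a, axis):
    """Numerically stable log-sum-exp that keeps the reduced axis."""
    m = a.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))


def gumbel_sinkhorn(cost, tau=0.1, noise_scale=1.0, n_iters=50, rng=None):
    """Relax the linear assignment on `cost` to a near-permutation matrix.

    Lower tau pushes the output closer to a hard one-to-one alignment.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Sample Gumbel(0, 1) noise; the small constants guard against log(0).
    u = rng.uniform(size=cost.shape)
    gumbel = -np.log(-np.log(u + 1e-20) + 1e-20)
    # Low cost means high matching score, hence the negation.
    log_alpha = (-cost + noise_scale * gumbel) / tau
    for _ in range(n_iters):
        log_alpha = log_alpha - _logsumexp(log_alpha, axis=1)  # normalize rows
        log_alpha = log_alpha - _logsumexp(log_alpha, axis=0)  # normalize columns
    return np.exp(log_alpha)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))      # toy node embeddings of graph 1
    y = x[[2, 0, 3, 1]]              # graph 2: the same nodes, shuffled
    p = gumbel_sinkhorn(pairwise_cost(x, y), tau=0.05, rng=rng)
    print(np.round(p, 3))            # soft alignment matrix
    print(p.argmax(axis=1))          # hard alignment read off row by row
```

Working in log space keeps the normalization stable at small temperatures, where direct exponentiation of the scaled scores would underflow. Setting `noise_scale=0.0` gives the plain, deterministic Sinkhorn operator, which is convenient for inspection.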

Paper Structure

This paper contains 21 sections, 14 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: The framework of the GNA.
  • Figure 2: Ablation study results. The top subfigures show results on the original test set; the bottom ones show results on the filtered dataset.
  • Figure 3: Visualization of graph ranking results on AIDS. Nodes with the same color have the same label.
  • Figure 4: A graph matching case study for GNA on the AIDS dataset. The heatmap shows the final node-level matching matrix, computed from the node features generated by the matching module. Color depth in the heatmap indicates the node matching weight: the deeper the color, the stronger the match. The red rectangular boxes on the right of the visualization mark the nodes that need to be deleted.