Hierarchy-Aware Neural Subgraph Matching with Enhanced Similarity Measure

Zhouyang Liu, Ning Liu, Yixin Chen, Jiezhong He, Menghan Jia, Dongsheng Li

TL;DR

This work tackles neural subgraph matching (NSM) by addressing two core shortcomings of prior NSM methods: scale sensitivity in encoding and lack of discriminative ranking in scoring. It introduces NC-Iso, which combines a hierarchy-aware GNN encoder that preserves relative feature positions within node-rooted subtrees with a novel similarity measure, the similarity dominance ratio (SDR), for robust scoring. The scoring function $\Psi(Q,D)$ normalizes containment violations via $\mathrm{Compliance}(Q,D)$ and quantifies similarity dominance through $\mathrm{SDR}(Q,D)$, enabling effective ranking of matched pairs. Experiments on nine datasets show generalization to graphs larger than those seen in training, scalability to large graphs, and competitiveness against both neural baselines and conventional subgraph matching algorithms. Code is available for reproducibility. Overall, NC-Iso provides a practical, discriminative, and scalable solution for subgraph retrieval tasks in diverse domains.

Abstract

Subgraph matching is challenging as it necessitates time-consuming combinatorial searches. Recent Graph Neural Network (GNN)-based approaches address this issue by employing GNN encoders to extract graph information and hinge distance measures to ensure containment constraints in the embedding space. These methods significantly shorten the response time, making them promising solutions for subgraph retrieval. However, they suffer from scale differences between graph pairs during encoding, as they focus on feature counts but overlook the relative positions of features within node-rooted subtrees, leading to disturbed containment constraints and false predictions. Additionally, their hinge distance measures lack discriminative power for matched graph pairs, hindering ranking applications. We propose NC-Iso, a novel GNN architecture for neural subgraph matching. NC-Iso preserves the relative positions of features by building the hierarchical dependencies between adjacent echelons within node-rooted subtrees, ensuring matched graph pairs maintain consistent hierarchies while complying with containment constraints in feature counts. To enhance the ranking ability for matched pairs, we introduce a novel similarity dominance ratio-enhanced measure, which quantifies the dominance of similarity over dissimilarity between graph pairs. Empirical results on nine datasets validate the effectiveness, generalization ability, scalability, and transferability of NC-Iso while maintaining time efficiency, offering a more discriminative neural subgraph matching solution for subgraph retrieval. Code available at https://github.com/liuzhouyang/NC-Iso.
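The ranking limitation of hinge distance measures mentioned in the abstract can be seen in a small sketch. It assumes the order-embedding form commonly used by earlier neural subgraph matching methods, in which the violation is the squared positive part of the query embedding minus the data embedding, summed over dimensions. The embeddings and the function name `hinge_violation` are hypothetical and hand-picked for illustration; this is not NC-Iso's measure.

```python
def hinge_violation(z_query, z_data):
    """Sum of squared positive parts of (z_query - z_data).

    Zero means the query embedding lies element-wise below the data
    embedding, i.e. the containment constraint is satisfied.
    """
    return sum(max(0.0, q - d) ** 2 for q, d in zip(z_query, z_data))


# Hypothetical 3-dimensional embeddings, chosen by hand (assumption).
z_data = [4.0, 3.0, 5.0]
z_close_match = [3.9, 2.9, 4.9]  # nearly fills the data embedding
z_loose_match = [0.1, 0.1, 0.1]  # tiny query, also contained
z_non_match = [5.0, 1.0, 1.0]    # exceeds z_data in the first dimension

scores = {
    "close_match": hinge_violation(z_close_match, z_data),
    "loose_match": hinge_violation(z_loose_match, z_data),
    "non_match": hinge_violation(z_non_match, z_data),
}
print(scores)
```

Both contained queries receive a violation of exactly zero, so the hinge value separates matches from the non-match but gives no basis for ordering the two matches against each other.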

Paper Structure

This paper contains 23 sections, 12 equations, 10 figures, 9 tables, 1 algorithm.

Figures (10)

  • Figure 1: The $2$-hop label counts of nodes within $Q$ and $D$, represented by black-bordered squares. Due to scale differences, the $k$-hop label count of node $b$ in $D$ exceeds that of nodes in $Q$, leading to a false positive match despite no structural alignment. Additionally, the additive nature of feature counts across nodes results in larger sums in data graphs, undermining coarse-grained graph-level containment constraints and exacerbating matching errors.
  • Figure 2: (Left) GNNs that use a permutation-invariant combination function may struggle to distinguish $S_a$ and $S_b$. In contrast, the sequential combination introduces hierarchy awareness, rendering $S_a$ and $S_b$ distinguishable. (Right) Our proposed measure normalizes the hinge distance and considers the intersection and normalized difference between compared pairs.
  • Figure 3: The overview of NC-Iso.
  • Figure 4: Ranking results on matched pairs.
  • Figure 5: We replace the hinge distance measure in NeuroMatch with our proposed one. The validation AUROC on the PROTEINS (Left) and MSRC_21 (Right) datasets demonstrates the substantial improvement brought by our proposed measure.
  • ...and 5 more figures
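
The failure mode described in the captions of Figures 1 and 2, where label counts satisfy containment although the structures do not align, can be reproduced on a toy pair of graphs. The sketch below is an illustration under stated assumptions, not the paper's encoder: the graphs are hand-made (they are not the graphs shown in the figures), the helpers `labels_by_hop` and `contained` are hypothetical, and BFS hop levels stand in for the node-rooted subtrees a GNN would unroll.

```python
from collections import Counter, deque


def labels_by_hop(adj, labels, root, k):
    """Return k+1 Counters: labels of nodes at BFS distance 0..k from root."""
    dist = {root: 0}
    queue = deque([root])
    levels = [Counter() for _ in range(k + 1)]
    while queue:
        node = queue.popleft()
        levels[dist[node]][labels[node]] += 1
        if dist[node] == k:
            continue  # do not expand beyond k hops
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return levels


def contained(small, big):
    """True if every label count in `small` is covered by `big`."""
    return all(big[label] >= count for label, count in small.items())


# Toy graphs (assumption): Q is the path A - B - C, D is the path A - C - B.
# Q is not a subgraph of D, because D has no edge between labels A and B.
path_adj = {0: [1], 1: [0, 2], 2: [1]}
q_labels = {0: "A", 1: "B", 2: "C"}
d_labels = {0: "A", 1: "C", 2: "B"}

q_levels = labels_by_hop(path_adj, q_labels, root=0, k=2)
d_levels = labels_by_hop(path_adj, d_labels, root=0, k=2)

# Flat 2-hop label counts: hop positions are discarded.
flat_ok = contained(sum(q_levels, Counter()), sum(d_levels, Counter()))
# Hop-by-hop counts: the position of each label in the hierarchy is kept.
per_hop_ok = all(contained(q, d) for q, d in zip(q_levels, d_levels))

print(flat_ok, per_hop_ok)
```

The flat count check accepts the pair, since both roots see one A, one B, and one C within two hops, while the hop-by-hop check rejects it because the labels at hop 1 differ. This mirrors the motivation for keeping relative feature positions rather than counts alone.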