
An Efficient Subgraph GNN with Provable Substructure Counting Power

Zuoyu Yan, Junru Zhou, Liangcai Gao, Zhi Tang, Muhan Zhang

TL;DR

This work targets efficient, provable substructure counting with graph neural networks. It introduces ESC-GNN, which augments backbones with precomputed distance-based structural embeddings over rooted $k$-tuples, enabling subgraph counting without repeatedly applying GNNs to all subgraphs. Theoretical results (Theorems 3.2–3.4, 4.2–4.5) characterize counting and distinguishing power relative to the Weisfeiler-Leman hierarchy and show ESC-GNN can count a range of connected and induced substructures while remaining more scalable than full subgraph GNNs. Empirically, ESC-GNN achieves strong real-world performance on molecular and TU benchmarks, with favorable space/time trade-offs and clear ablations illustrating the benefits of global message passing and all components of the structural encoding. Overall, the approach provides a practical, provably expressive model for substructure-aware graph learning with reduced computational cost.

Abstract

We investigate how to enhance the representation power of graph neural networks (GNNs) through their ability to count substructures. Recent advances have seen the adoption of subgraph GNNs, which partition an input graph into numerous subgraphs and then apply a GNN to each one to augment the graph's overall representation. Despite their ability to identify various substructures, subgraph GNNs are hindered by significant computational and memory costs. In this paper, we tackle a critical question: is it possible for GNNs to count substructures both efficiently and provably? Our approach begins with a theoretical demonstration that the distance to rooted nodes in subgraphs is key to boosting the counting power of subgraph GNNs. To avoid repeatedly applying GNNs across all subgraphs, we introduce precomputed structural embeddings that encapsulate this crucial distance information. Experiments validate that our proposed model retains the counting power of subgraph GNNs while running significantly faster.
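The idea of precomputing distance information around rooted nodes can be illustrated with a small sketch. It is not the paper's actual encoding: it assumes, purely for illustration, that the embedding of a rooted edge $(u, v)$ is a histogram of the distance pairs (distance to $u$, distance to $v$) over nearby nodes. The function names, the `max_hops` parameter, and the sentinel for far-away nodes are all hypothetical.

```python
from collections import Counter, deque


def truncated_bfs(adj, source, max_hops):
    """Shortest-path distances from `source`, exploring at most `max_hops` hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def rooted_edge_encoding(adj, u, v, max_hops=2):
    """Histogram of (distance to u, distance to v) over nodes near the rooted edge.

    Illustrative assumption: nodes farther than `max_hops` from one root
    get the sentinel distance `max_hops + 1` for that root.
    """
    far = max_hops + 1
    dist_u = truncated_bfs(adj, u, max_hops)
    dist_v = truncated_bfs(adj, v, max_hops)
    nearby = set(dist_u) | set(dist_v)
    return Counter((dist_u.get(w, far), dist_v.get(w, far)) for w in nearby)


def precompute_encodings(adj, max_hops=2):
    """One encoding per directed edge, computed once before any training."""
    return {
        (u, v): rooted_edge_encoding(adj, u, v, max_hops)
        for u in adj
        for v in adj[u]
    }


# A triangle 0-1-2 with a pendant node 3 attached to node 2.
triangle_with_tail = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
encodings = precompute_encodings(triangle_with_tail)
print(encodings[(0, 1)])
```

In this toy encoding, the entry for the distance pair `(1, 1)` is the number of common neighbors of the two roots, which equals the number of triangles passing through the rooted edge. This shows how a purely distance-based feature can carry substructure counts without running a GNN on each subgraph.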

Paper Structure

This paper contains 15 sections, 10 theorems, 7 equations, 6 figures, and 7 tables.

Key Result

Theorem 2.1

For any $k\geq 2$, there exists a pair of graphs $G$ and $H$ such that $G$ contains a $(k+1)$-clique as a subgraph while $H$ does not, and $k$-WL cannot distinguish $G$ from $H$.
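A concrete pair of this kind can be checked numerically. The sketch below builds the two graphs shown in Figure 1, counts their 4-cliques by brute force, and runs a pair-colouring refinement on both. Two assumptions are made here and are not taken from the paper: that this pair serves as the $k=3$ instance, and that the refinement implemented (the folklore 2-WL variant, often described as equivalent to 3-WL under the non-folklore indexing) matches the paper's definition of $k$-WL. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def rook_graph(n=4):
    """n x n Rook graph: cells are adjacent when they share a row or a column."""
    nodes = list(product(range(n), repeat=2))
    return {
        a: {b for b in nodes if b != a and (a[0] == b[0] or a[1] == b[1])}
        for a in nodes
    }


def shrikhande_graph():
    """Shrikhande graph as a Cayley graph on Z_4 x Z_4."""
    steps = [(1, 0), (3, 0), (0, 1), (0, 3), (1, 1), (3, 3)]
    nodes = list(product(range(4), repeat=2))
    return {
        (i, j): {((i + di) % 4, (j + dj) % 4) for di, dj in steps}
        for i, j in nodes
    }


def count_cliques(adj, size):
    """Brute-force count of node subsets of the given size that are cliques."""
    return sum(
        all(b in adj[a] for a, b in combinations(group, 2))
        for group in combinations(adj, size)
    )


def pair_refinement_histograms(graphs, rounds=4):
    """Folklore 2-WL style refinement on ordered node pairs.

    Colours are relabelled with a palette shared by all graphs, so the
    returned colour histograms are directly comparable across graphs.
    """
    colors = [
        {(u, v): 0 if u == v else (1 if v in adj[u] else 2)
         for u in adj for v in adj}
        for adj in graphs
    ]
    for _ in range(rounds):
        signatures = [
            {(u, v): (col[u, v],
                      tuple(sorted((col[u, w], col[w, v]) for w in adj)))
             for (u, v) in col}
            for adj, col in zip(graphs, colors)
        ]
        all_sigs = set().union(*(sig.values() for sig in signatures))
        palette = {s: i for i, s in enumerate(sorted(all_sigs))}
        colors = [{p: palette[s] for p, s in sig.items()} for sig in signatures]
    return [Counter(col.values()) for col in colors]


rook, shrikhande = rook_graph(), shrikhande_graph()
hist_rook, hist_shrikhande = pair_refinement_histograms([rook, shrikhande])
print("4-cliques:", count_cliques(rook, 4), count_cliques(shrikhande, 4))
print("same colour histogram:", hist_rook == hist_shrikhande)
```

Each row and each column of the Rook graph forms a 4-clique, whereas the neighborhood of every Shrikhande vertex is a 6-cycle and therefore contains no triangle, so no 4-clique exists there. Both graphs are strongly regular with the same parameters, which is why the pair-colouring refinement ends with identical histograms.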

Figures (6)

  • Figure 1: (a) the $4\times 4$ Rook graph and (b) the Shrikhande graph
  • Figure 2: Examples of 4-cycles that pass the rooted edges. In these figures, the rooted 2-tuples are colored blue.
  • Figure 3: Examples where ESC-GNN cannot subgraph-count 5-cycles. In these figures, the rooted 2-tuples are colored blue.
  • Figure 4: Examples of stars that pass the rooted edges. In these figures, the rooted 2-tuples are colored blue.
  • Figure 5: Examples where ESC-GNN cannot induced-subgraph-count 5-stars. In these figures, the rooted 2-tuples are colored blue.
  • ...and 1 more figure

Theorems & Definitions (17)

  • Theorem 2.1
  • proof
  • Lemma 2.2
  • Lemma 2.3
  • Lemma 2.4
  • Remark 2.5
  • Theorem 3.1
  • proof
  • Theorem 4.1
  • proof
  • ...and 7 more