Efficient Subgraph GNNs by Learning Effective Selection Policies

Beatrice Bevilacqua; Moshe Eliasof; Eli Meirom; Bruno Ribeiro; Haggai Maron

TL;DR

This work addresses the high computational cost of Subgraph GNNs by learning data-driven policies that select a small, task-relevant subset of subgraphs. It introduces Policy-Learn, a two-network system: a DS-GNN-based selection network builds a bag of subgraphs iteratively using differentiable sampling, and a downstream DS-GNN prediction network solves the task on the selected bag. Theoretically, Policy-Learn can identify the isomorphism types of graphs in WL-indistinguishable families using as few as ℓ subgraphs, which random policies and prior methods cannot do efficiently. Empirically, it reaches competitive accuracy with significant runtime savings across diverse datasets, making Subgraph GNNs more practical to deploy by learning which subgraphs to process.

Abstract

Subgraph GNNs are provably expressive neural architectures that learn graph representations from sets of subgraphs. Unfortunately, their applicability is hampered by the computational complexity associated with performing message passing on many subgraphs. In this paper, we consider the problem of learning to select a small subset of the large set of possible subgraphs in a data-driven fashion. We first motivate the problem by proving that there are families of WL-indistinguishable graphs for which there exist efficient subgraph selection policies: small subsets of subgraphs that can already identify all the graphs within the family. We then propose a new approach, called Policy-Learn, that learns how to select subgraphs in an iterative manner. We prove that, unlike popular random policies and prior work addressing the same problem, our architecture is able to learn the efficient policies mentioned above. Our experimental results demonstrate that Policy-Learn outperforms existing baselines across a wide range of datasets.

Paper Structure (36 sections, 14 theorems, 13 equations, 3 figures, 12 tables, 1 algorithm)

Introduction
Related Work
Problem Formulation
Objective.
Insights for the Subgraph Selection Learning Problem
Powerful policies containing only a single subgraph.
Families of graphs that require specific selections of $\ell$ subgraphs.
Method
Overview
Subgraph Selection Network
Backpropagating through the sampling process.
Downstream Prediction Network
Theoretical Analysis
Experiments
Conclusions
...and 21 more sections

Key Result

Theorem 1

Let $\mathcal{G}_{n, \ell}$ be the family of non-isomorphic $(n, \ell)$-CSL graphs (Definition 2). All graphs in $\mathcal{G}_{n, \ell}$ are WL-indistinguishable.
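
As a sanity check on the flavour of this result, the sketch below runs 1-WL colour refinement jointly on two small graphs. It rests on assumptions that are not spelled out in this summary: a CSL graph is modelled as an $n$-cycle with added skip links, and an $(n, \ell)$-CSL graph as a disjoint union of $\ell$ such components. The helper names (`csl_graph`, `disjoint_union`, `wl_histograms`) and the skip lengths are illustrative only. The two unions differ in one component (skip 3 versus skip 5, the pair Figure 1 describes as non-isomorphic).

```python
from collections import Counter


def csl_graph(n, skip):
    """Adjacency sets of an n-cycle with extra skip links of length `skip`."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in (1, skip):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj


def disjoint_union(components):
    """Relabel nodes as (component index, node) and merge the adjacency sets."""
    union = {}
    for i, comp in enumerate(components):
        for v, nbrs in comp.items():
            union[(i, v)] = {(i, u) for u in nbrs}
    return union


def wl_histograms(graphs, marked=None):
    """Jointly refine colours on all graphs; return one colour histogram each.

    `marked` optionally gives, per graph, a set of nodes that start with a
    distinct colour (node marking); all other nodes start with colour 0.
    """
    marked = marked or [set() for _ in graphs]
    colors = [{v: int(v in m) for v in g} for g, m in zip(graphs, marked)]
    rounds = sum(len(g) for g in graphs)  # enough for refinement to stabilise
    for _ in range(rounds):
        sigs = [
            {v: (c[v], tuple(sorted(c[u] for u in g[v]))) for v in g}
            for g, c in zip(graphs, colors)
        ]
        # Shared palette so that colours are comparable across graphs.
        unique = sorted({s for sg in sigs for s in sg.values()})
        palette = {s: k for k, s in enumerate(unique)}
        colors = [{v: palette[s] for v, s in sg.items()} for sg in sigs]
    return [Counter(c.values()) for c in colors]


# Two hypothetical (13, 2)-CSL graphs that differ in one component.
g_a = disjoint_union([csl_graph(13, 2), csl_graph(13, 3)])
g_b = disjoint_union([csl_graph(13, 2), csl_graph(13, 5)])
hist_a, hist_b = wl_histograms([g_a, g_b])
same_without_marking = hist_a == hist_b

# The single components from Figure 1, each with node 0 marked.
c5, c3 = csl_graph(13, 5), csl_graph(13, 3)
m5, m3 = wl_histograms([c5, c3], marked=[{0}, {0}])
differ_with_marking = m5 != m3
```

Because every node in these graphs has degree 4, unmarked refinement never splits the initial colour class, so `same_without_marking` holds. Marking a single node breaks the symmetry: the skip-3 graph has nodes three hops from the marked node while the skip-5 graph does not, so `differ_with_marking` holds as well.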

Figures (3)

Figure 1: Sufficiency of small bags. Two non-isomorphic graphs (Circulant Skip Links graphs with 13 nodes and skip lengths 5 and 3, respectively) that can be distinguished using a bag containing only a single subgraph, generated by marking one node.
Figure 2: An $(n, \ell)$-CSL graph is obtained from $\ell$ disconnected, non-isomorphic CSL graphs with $n$ nodes, where $n=13$ in this case.
Figure 3: An overview of Policy-learn. Policy-learn consists of two Subgraph GNNs: a selection network and a prediction network. The selection network generates the bag of subgraphs by iteratively parameterizing a probability distribution over the nodes of the original graph. When the bag reaches its maximal size $T+1$ (the original graph plus $T$ selections), it is passed to the prediction network for the downstream task.
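
The iterative bag construction described in the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_selection_scores` is a hypothetical stand-in for the selection network, Gumbel-max sampling is used here only as one common way to draw from a categorical distribution over nodes (the summary says only that sampling is differentiable, and no gradient estimator is shown), and masking already selected nodes is our own choice.

```python
import math
import random


def gumbel_max_sample(logits, rng):
    """Draw an index from softmax(logits) using the Gumbel-max trick."""
    best, best_score = None, -math.inf
    for i, logit in enumerate(logits):
        if logit == -math.inf:
            continue  # masked entries can never be selected
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        score = logit - math.log(-math.log(u))  # logit + Gumbel(0, 1) noise
        if score > best_score:
            best, best_score = i, score
    return best


def toy_selection_scores(adj, bag):
    """Hypothetical stand-in for the selection network: one logit per node.

    It simply prefers nodes that are far (in hops) from already marked nodes.
    """
    marked = [v for v in bag if v is not None]
    if not marked:
        return [0.0] * len(adj)
    dist = {v: 0 for v in marked}
    frontier = list(marked)
    while frontier:  # multi-source BFS from the marked nodes
        nxt = []
        for v in frontier:
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    nxt.append(u)
        frontier = nxt
    return [float(dist.get(v, len(adj))) for v in range(len(adj))]


def build_bag(adj, T, seed=0):
    """Return T + 1 entries: None for the original graph, then T root nodes.

    Requires T to be smaller than the number of nodes.
    """
    rng = random.Random(seed)
    bag = [None]
    for _ in range(T):
        logits = toy_selection_scores(adj, bag)
        for v in bag[1:]:
            logits[v] = -math.inf  # assumption: never pick the same root twice
        bag.append(gumbel_max_sample(logits, rng))
    return bag


# A 6-cycle as a small example graph (adjacency lists indexed by node id).
cycle = [[(v - 1) % 6, (v + 1) % 6] for v in range(6)]
bag = build_bag(cycle, T=3)
```

Each iteration conditions the node distribution on the current bag, which is the property that lets a learned policy make later selections depend on earlier ones. In this sketch the finished `bag` would then be handed to a prediction network, which is omitted.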

Theorems & Definitions (25)

Definition 1: CSL graph (Murphy et al., 2019)
Definition 2: $(n, \ell)$-CSL graph
Theorem 1: $(n, \ell)$-CSL graphs are WL-indistinguishable
Proposition 1: There exists an efficient $\pi$ that fully identifies $(n, \ell)$-CSL graphs
Proposition 2: A random policy cannot efficiently identify $(n, \ell)$-CSL graphs
Theorem 2: Policy-learn can identify $\mathcal{G}_{n, \ell}$
Theorem 3: OSAN (Qian et al., 2022) cannot efficiently identify $\mathcal{G}_{n, \ell}$
Theorem 3: $(n, \ell)$-CSL graphs are WL-indistinguishable
proof
Proposition 3: There exists an efficient $\pi$ that fully identifies $(n, \ell)$-CSL graphs
...and 15 more
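
For a rough feel of why a uniformly random policy struggles on this family (the proposition on random policies listed above), the sketch below computes a simple success probability. The modelling assumptions are ours and are not quoted from the paper's statement: identification is taken to require one marked node in each of the $\ell$ components, all components have the same size, and the random policy draws $\ell$ nodes independently and uniformly with replacement. Under these assumptions the probability is $\ell! / \ell^{\ell}$; the function name is illustrative.

```python
from fractions import Fraction
from math import factorial


def random_policy_hit_probability(num_components):
    """P(ell uniform draws land in ell distinct, equally sized components)."""
    ell = num_components
    return Fraction(factorial(ell), ell ** ell)


probabilities = {
    ell: random_policy_hit_probability(ell) for ell in (1, 2, 3, 5, 10)
}
```

The values shrink quickly as $\ell$ grows (for example 1/2 at $\ell = 2$ and 2/9 at $\ell = 3$), which matches the intuition that a policy conditioning each choice on the previous ones has room to do much better than independent random draws.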
