
MuseGNN: Forming Scalable, Convergent GNN Layers that Minimize a Sampling-Based Energy

Haitian Jiang, Renjie Liu, Zengfeng Huang, Yichuan Wang, Xiao Yan, Zhenkun Cai, Minjie Wang, David Wipf

TL;DR

MuseGNN tackles scaling of energy-based unfolded GNNs by embedding a sampling-based graph-regularized energy into the learning objective. It defines an offline-subgraph energy $\ell_{\text{muse}}(\mathbb{Y}, M)$ and optimizes via alternating minimization over subgraph embeddings and shared node summaries, with an online mean estimator linking subgraphs. The authors establish convergence guarantees for both the upper-level training and the lower-level energy descent under specific settings and demonstrate stability and competitive accuracy on extremely large homogeneous graphs, including benchmarks exceeding 1 TB. This approach delivers scalable, interpretable GNN layers that retain expressive power and competitive performance without prohibitive memory requirements.
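
To make the alternating scheme concrete, here is a toy NumPy sketch. It is not the authors' implementation, and the exact energy form is an assumption chosen for illustration: each subgraph $s$ contributes a fitting term, a Laplacian smoothness term weighted by a hypothetical `lam`, and a penalty weighted by `gamma` that ties its embeddings to the shared summary rows of `M`. The names `muse_energy` and `alternating_minimization` are hypothetical, and the M-step uses an exact mean over subgraphs rather than the online estimator mentioned above.

```python
import numpy as np

def laplacian(A):
    """Unnormalized graph Laplacian D - A of a symmetric adjacency matrix."""
    return np.diag(A.sum(axis=1)) - A

def muse_energy(Ys, M, Bs, Ls, idxs, lam, gamma):
    """Assumed energy: sum over subgraphs of fit + smoothness + coupling to M."""
    total = 0.0
    for Y, B, L, idx in zip(Ys, Bs, Ls, idxs):
        total += 0.5 * np.sum((Y - B) ** 2)             # fit to base features
        total += 0.5 * lam * np.trace(Y.T @ L @ Y)      # graph regularization
        total += 0.5 * gamma * np.sum((Y - M[idx]) ** 2)  # tie to shared summary
    return total

def alternating_minimization(B_full, A_full, idxs, lam=1.0, gamma=1.0, num_rounds=10):
    """Block coordinate descent over subgraph embeddings Ys and summaries M."""
    n, d = B_full.shape
    Bs = [B_full[idx] for idx in idxs]
    Ls = [laplacian(A_full[np.ix_(idx, idx)]) for idx in idxs]
    Ys = [B.copy() for B in Bs]
    M = np.zeros((n, d))
    history = [muse_energy(Ys, M, Bs, Ls, idxs, lam, gamma)]
    for _ in range(num_rounds):
        # Y-step: with M fixed, each subgraph problem is a strictly convex
        # quadratic, solved exactly by one linear system.
        for s, idx in enumerate(idxs):
            H = (1.0 + gamma) * np.eye(len(idx)) + lam * Ls[s]
            Ys[s] = np.linalg.solve(H, Bs[s] + gamma * M[idx])
        # M-step: with Ys fixed, the minimizer for each node is the mean of
        # its embeddings over all subgraphs that contain it.
        sums = np.zeros((n, d))
        counts = np.zeros(n)
        for Y, idx in zip(Ys, idxs):
            np.add.at(sums, idx, Y)
            np.add.at(counts, idx, 1.0)
        seen = counts > 0
        M[seen] = sums[seen] / counts[seen, None]
        history.append(muse_energy(Ys, M, Bs, Ls, idxs, lam, gamma))
    return Ys, M, history
```

Because both steps are exact minimizations over their own block, the recorded energy in `history` cannot increase from one round to the next in this toy setting.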

Abstract

Among the many variants of graph neural network (GNN) architectures capable of modeling data with cross-instance relations, an important subclass involves layers designed such that the forward pass iteratively reduces a graph-regularized energy function of interest. In this way, node embeddings produced at the output layer dually serve as both predictive features for solving downstream tasks (e.g., node classification) and energy function minimizers that inherit transparent, exploitable inductive biases and interpretability. However, scaling GNN architectures constructed in this way remains challenging, in part because the convergence of the forward pass may involve models with considerable depth. To tackle this limitation, we propose a sampling-based energy function and scalable GNN layers that iteratively reduce it, guided by convergence guarantees in certain settings. We also instantiate a full GNN architecture based on these designs, and the model achieves competitive accuracy and scalability when applied to the largest publicly-available node classification benchmark exceeding 1TB in size. Our source code is available at https://github.com/haitian-jiang/MuseGNN.
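
As a rough illustration of what "the forward pass iteratively reduces a graph-regularized energy" can mean, the sketch below treats each layer as one gradient step on a common quadratic energy, $\frac{1}{2}\|Y - B\|_F^2 + \frac{\lambda}{2}\operatorname{tr}(Y^\top L Y)$. This energy, the step-size rule, and the names `energy` and `unfolded_forward` are assumptions for illustration only; the abstract does not specify them. Here `B` stands in for base node features (e.g., the output of some learned map of the inputs) and `L` is the unnormalized graph Laplacian.

```python
import numpy as np

def energy(Y, B, L, lam):
    """Assumed energy: 0.5*||Y - B||_F^2 + 0.5*lam*tr(Y^T L Y)."""
    return 0.5 * np.sum((Y - B) ** 2) + 0.5 * lam * np.trace(Y.T @ L @ Y)

def unfolded_forward(B, A, lam=1.0, num_layers=20):
    """Run num_layers gradient steps; each step plays the role of one layer."""
    deg = A.sum(axis=1)
    L = np.diag(deg) - A
    # The Hessian is I + lam*L and the largest eigenvalue of L is at most
    # 2*max_degree, so this step size guarantees monotone energy descent.
    alpha = 1.0 / (1.0 + lam * 2.0 * deg.max())
    Y = B.copy()
    energies = [energy(Y, B, L, lam)]
    for _ in range(num_layers):
        grad = (Y - B) + lam * (L @ Y)   # gradient of the energy at Y
        Y = Y - alpha * grad             # one "layer" = one descent step
        energies.append(energy(Y, B, L, lam))
    return Y, energies
```

The slow approach of `energies` toward its minimum on poorly connected graphs hints at why such models may need considerable depth, which is the scaling difficulty the abstract describes.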

Paper Structure

This paper contains 45 sections, 10 theorems, 31 equations, 5 figures, 7 tables, 1 algorithm.

Key Result

Proposition 3.1

Suppose we have $m$ subgraphs $(\mathcal{V}_1,\mathcal{E}_1),\ldots,(\mathcal{V}_m,\mathcal{E}_m)$ constructed independently such that $\forall s=1,\ldots,m, \forall u,v\in \mathcal{V}, \Pr[v\in\mathcal{V}_s]=\Pr[v\in\mathcal{V}_s\mid u\in\mathcal{V}_s]=p; (i,j)\in\mathcal{E}_s \iff i\in\mathcal{V}_

Figures (5)

  • Figure 1: MuseGNN vs. existing methods on largest graphs (LG), where 'top acc.' refers to top LG accuracy. Note that the convergence guarantee and greater energy expressivity are specifically defined w.r.t. UGNN models, hence 'N/A' for non-UGNNs without a lower-level energy.
  • Figure 2: Building on the expressiveness analysis, energy functions based on sampled subgraphs (as incorporated by MuseGNN) can achieve increased expressiveness.
  • Figure 3: Convergence of the upper-level loss on the ogbn-arxiv dataset over 20 epochs with penalty factor $\gamma=1$.
  • Figure : MuseGNN Training Procedure
  • Figure : Training speed (epoch time) in seconds; hardware configurations are described in the experimental details section.

Theorems & Definitions (21)

  • Proposition 3.1
  • Definition 5.1
  • Theorem 5.2
  • Theorem 5.3
  • Proposition C.1
  • proof
  • proof
  • Lemma E.1
  • proof
  • Lemma E.2
  • ...and 11 more