Enhancing Size Generalization in Graph Neural Networks through Disentangled Representation Learning

Zheng Huang; Qihui Yang; Dawei Zhou; Yujun Yan

TL;DR

This paper tackles size generalization in graph neural networks by disentangling size-related from task-related information. It introduces DISGEN, which uses size- and task-invariant graph augmentations and a decoupling loss to minimize the information shared between size and task representations, with theoretical guarantees. The approach is model-agnostic and is validated across multiple datasets and backbones, achieving up to a 6% improvement on larger test graphs. The work advances practical size generalization for GNNs and offers a principled framework for disentangled representation learning on graphs.

Abstract

Although most graph neural networks (GNNs) can operate on graphs of any size, their classification performance often declines on graphs larger than those encountered during training. Existing methods do not adequately remove size information from graph representations, which leads to sub-optimal performance and a dependence on the backbone model. In response, we propose DISGEN, a novel and model-agnostic framework designed to disentangle size factors from graph representations. DISGEN employs size- and task-invariant augmentations and introduces a decoupling loss that minimizes shared information in hidden representations, with theoretical guarantees for its effectiveness. Our empirical results show that DISGEN outperforms state-of-the-art models by up to 6% on real-world datasets, underscoring its effectiveness in enhancing the size generalizability of GNNs. Our code is available at: https://github.com/GraphmindDartmouth/DISGEN.
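To make the idea of penalizing shared information between the two representations concrete, the following is a minimal sketch, not the decoupling loss defined in the paper. It assumes a simple stand-in: the mean squared cosine similarity between each graph's size representation and its task representation, which is zero when every pair is orthogonal. The function names `cosine` and `decoupling_penalty` are hypothetical, and cosine similarity only captures linear alignment, not shared information in general.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; defined as 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def decoupling_penalty(
    size_reps: Sequence[Sequence[float]],
    task_reps: Sequence[Sequence[float]],
) -> float:
    """Illustrative stand-in (an assumption, not the paper's loss):
    mean squared cosine similarity between paired size and task vectors."""
    if len(size_reps) != len(task_reps):
        raise ValueError("size_reps and task_reps must have the same length")
    if not size_reps:
        return 0.0
    total = sum(cosine(s, t) ** 2 for s, t in zip(size_reps, task_reps))
    return total / len(size_reps)


# Orthogonal pair: no linear alignment, so the penalty vanishes.
orthogonal = decoupling_penalty([[1.0, 0.0]], [[0.0, 1.0]])
# Parallel pair: fully aligned, so the penalty is at its maximum of 1.
aligned = decoupling_penalty([[1.0, 0.0]], [[2.0, 0.0]])
```

In a training loop, a term of this kind would be added to the task loss so that the optimizer is discouraged from encoding the same direction in both representations. The repository linked above is the reference for the actual formulation.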

Paper Structure (19 sections, 5 theorems, 36 equations, 2 figures, 12 tables)

Introduction
Preliminary
Methodology
Framework Overview
Augmentation
Decoupling Loss
Design
Theoretical Analysis
Experiments
Experimental Setup
Effectiveness of DisGen
Ablation Study
Related Work
Conclusion
Lemmas in Theoretical Analysis
...and 4 more sections

Key Result

Theorem 3.2

Consider the composite functions $\mathrm{ENC}_i \circ \mathbf{f}(\cdot)$, $i \in \{1,2\}$, defined on a closed set $S \subseteq \mathbb{R}^{2c}$. Assume that these composite functions are twice differentiable at some point $\mathbf{r}_0$, and the gradients $\nabla\mathbf{h}_t$ and $\nabla\mathbf{h}_s$ at […] $\Rightarrow$ $\tilde{\mathbf{t}}$ and $\tilde{\mathbf{s}}$ cannot be decoupled from $\mathbf{f}(\cdot,$ […]

Figures (2)

Figure 1: Framework overview: our model augments each graph $\mathcal{G}_i$ with size- and task-invariant views ($\mathcal{G}_i^{(1)}$ and $\mathcal{G}_i^{(2)}$), which, along with the original graph, are processed by a shared GNN backbone. Two encoders then generate size- ($\mathbf{s}_i$) and task-related ($\mathbf{t}_i$) representations, respectively. A contrastive loss on size-related representations guides relative size learning, while a decoupling loss ensures the separation of size- and task-related information.
Figure 2: Augmentation overview: view $\mathcal{G}_i^{(1)}$ is generated by removing edges that most significantly change the label information, while $\mathcal{G}_i^{(2)}$ results from eliminating nodes that have little impact on the model predictions.
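The node-removal view described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes that a per-node importance score is already available; how the paper computes such scores is not stated here, so the `importance` dictionary, the `drop_ratio` parameter, and the function name `drop_least_important_nodes` are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def drop_least_important_nodes(
    nodes: List[int],
    edges: List[Edge],
    importance: Dict[int, float],
    drop_ratio: float,
) -> Tuple[List[int], List[Edge]]:
    """Return a smaller view of the graph with the lowest-scoring nodes removed.

    `importance` is an assumed, precomputed score per node (higher means the
    node matters more to the model's prediction).
    """
    num_drop = int(len(nodes) * drop_ratio)
    # Rank nodes from least to most important; ties are broken by node id.
    ranked = sorted(nodes, key=lambda n: (importance[n], n))
    dropped: Set[int] = set(ranked[:num_drop])
    kept_nodes = [n for n in nodes if n not in dropped]
    # Keep only edges whose endpoints both survive.
    kept_edges = [(u, v) for u, v in edges if u not in dropped and v not in dropped]
    return kept_nodes, kept_edges


# A 4-cycle in which node 1 has the lowest (hypothetical) importance score.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
importance = {0: 0.9, 1: 0.1, 2: 0.8, 3: 0.7}
view_nodes, view_edges = drop_least_important_nodes(
    nodes, edges, importance, drop_ratio=0.25
)
```

With a drop ratio of 0.25 on four nodes, exactly one node is removed, together with its two incident edges. The resulting view is smaller than the original graph while retaining the nodes that score highest.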

Theorems & Definitions (10)

Definition 3.1
Theorem 3.2
proof
Theorem 3.3
proof
Definition 3.4
Theorem 3.5
proof
Lemma 1.1
Lemma 1.2
