Subgraph Pooling: Tackling Negative Transfer on Graphs

Zehong Wang, Zheyuan Zhang, Chuxu Zhang, Yanfang Ye

TL;DR

The paper studies negative transfer in graph neural networks: structural differences between semantically similar source and target graphs induce distribution shifts in node embeddings. It introduces Subgraph Pooling (SP) and Subgraph Pooling++ (SP++), which transfer subgraph-level information by pooling node embeddings over a $k$-hop neighborhood (SP) or over nodes sampled by a random walk (SP++). This reduces the graph discrepancy, measured by CMD, and improves transfer performance. The authors analyze theoretically how SP reduces discrepancy and report that, across diverse datasets and transfer settings, SP/SP++ consistently outperform strong baselines while adding no learnable parameters and remaining compatible with any GNN backbone. The work offers a practical, efficient approach to robust graph transfer learning for evolving real-world graph structures, and it is accompanied by public code and data.

Abstract

Transfer learning aims to enhance performance on a target task by using knowledge from related tasks. However, when the source and target tasks are not closely aligned, it can lead to reduced performance, known as negative transfer. Unlike in image or text data, we find that negative transfer could commonly occur in graph-structured data, even when source and target graphs have semantic similarities. Specifically, we identify that structural differences significantly amplify the dissimilarities in the node embeddings across graphs. To mitigate this, we bring a new insight in this paper: for semantically similar graphs, although structural differences lead to significant distribution shift in node embeddings, their impact on subgraph embeddings could be marginal. Building on this insight, we introduce Subgraph Pooling (SP) by aggregating nodes sampled from a k-hop neighborhood and Subgraph Pooling++ (SP++) by a random walk, to mitigate the impact of graph structural differences on knowledge transfer. We theoretically analyze the role of SP in reducing graph discrepancy and conduct extensive experiments to evaluate its superiority under various settings. The proposed SP methods are effective yet elegant, which can be easily applied on top of any backbone Graph Neural Networks (GNNs). Our code and data are available at: https://github.com/Zehong-Wang/Subgraph-Pooling.
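The two sampling strategies named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy graph, the embedding values, and the walk parameters are all assumptions made for the example, and MEAN is used as the pooling function.

```python
import random
from collections import deque


def k_hop_neighbors(adj, node, k):
    """Nodes reachable from `node` in at most k hops (excluding `node`), via BFS."""
    seen = {node}
    frontier = deque([(node, 0)])
    while frontier:
        cur, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nxt in adj[cur]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return sorted(seen - {node})


def random_walk_neighbors(adj, node, walk_length, num_walks, rng):
    """Nodes visited by `num_walks` uniform random walks of `walk_length` steps."""
    visited = set()
    for _ in range(num_walks):
        cur = node
        for _ in range(walk_length):
            if not adj[cur]:
                break  # dead end: stop this walk early
            cur = rng.choice(adj[cur])
            visited.add(cur)
    return sorted(visited - {node})


def mean_pool(z, node, neighbors):
    """Average the embedding of `node` with those of its sampled neighbors."""
    members = [node] + list(neighbors)
    dim = len(z[node])
    return [sum(z[i][d] for i in members) / len(members) for d in range(dim)]


# Toy path graph 0-1-2-3 with made-up 2-d embeddings from a hypothetical backbone.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
z = {0: [0.0, 0.0], 1: [2.0, 0.0], 2: [4.0, 2.0], 3: [6.0, 4.0]}

sp_nodes = k_hop_neighbors(adj, 0, k=2)  # SP-style sampler (assumed form)
rw_nodes = random_walk_neighbors(adj, 0, 3, 4, random.Random(0))  # SP++-style sampler (assumed form)
h_sp = mean_pool(z, 0, sp_nodes)
h_rw = mean_pool(z, 0, rw_nodes)
```

Both samplers feed the same parameter-free pooling step, which is why such a layer can sit on top of any backbone that outputs node embeddings.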

Paper Structure (32 sections, 3 theorems, 16 equations, 7 figures, 9 tables)

Key Result

Theorem 1

For node $u \in \mathcal{V}^s$ in the source graph and $v \in \mathcal{V}^t$ in the target graph, considering the MEAN pooling function, the subgraph embeddings are $\mathbf{h}_u = \frac{\mathbf{z}_u + \sum_{i \in \mathcal{N}_s(u)} \mathbf{z}_i}{n + 1}$ and $\mathbf{h}_v = \frac{\mathbf{z}_v + \sum_{j \in \mathcal{N}_t(v)} \mathbf{z}_j}{m + 1}$ [...] where $\Delta = \frac{n \| \mathbf{z}_u - \mathbf{z}_v \| - \frac{m - n}{m+1} \| \mathbf{z}_v \|}{\cdots}$ [...]
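To make the MEAN-pooled subgraph embeddings in Theorem 1 concrete, the sketch below computes $\mathbf{h}_u$ and $\mathbf{h}_v$ for hand-picked toy vectors and compares the node-level and subgraph-level distances. It assumes $n$ and $m$ are the numbers of sampled neighbors of $u$ and $v$, as the denominators suggest. The values are invented for illustration, and the example does not verify the theorem's bound.

```python
import math


def mean_pooled(z_center, z_neighbors):
    """(z_center + sum of neighbor embeddings) / (number of neighbors + 1)."""
    rows = [z_center] + z_neighbors
    return [sum(col) / len(rows) for col in zip(*rows)]


def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hand-picked toy values, not taken from the paper.
z_u, nbrs_u = [1.0, 0.0], [[0.0, 1.0], [0.5, 0.5]]              # n = 2
z_v, nbrs_v = [0.0, 1.0], [[1.0, 0.0], [0.5, 0.5], [0.5, 0.5]]  # m = 3

h_u = mean_pooled(z_u, nbrs_u)
h_v = mean_pooled(z_v, nbrs_v)
node_gap = l2(z_u, z_v)      # distance between node embeddings
subgraph_gap = l2(h_u, h_v)  # distance between pooled subgraph embeddings
```

With these particular values both pooled embeddings equal `[0.5, 0.5]`, so the subgraph-level gap is 0 while the node-level gap is $\sqrt{2}$. The neighborhoods were chosen to be similar on purpose, which is the situation the abstract describes for semantically similar graphs.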

Figures (7)

  • Figure 1: Structural differences between the source (DBLP) and target (ACM) amplify the distribution shift in node embeddings. Left: We illustrate the discrepancy (CMD value) between node embeddings of the source and target during pre-training, and compare the performance of direct training on the target (gray) and transferring knowledge from the source to the target (blue). A large discrepancy results in negative transfer. Right: We introduce structural noise in the target graph through random edge permutation. Even minor permutations can enlarge the discrepancy (and thus aggravate negative transfer) in vanilla GCN, yet our method effectively mitigates the issue.
  • Figure 2: Subgraph Pooling++ (SP++) mitigates the risk of over-smoothing caused by a large pooling kernel. We conduct transfer learning from ACM to DBLP. Left: Illustration of the subgraph embeddings on the target graph with $k=5$, where SP++ with the random-walk (RW) sampler yields clearer class boundaries. Right: Transfer learning performance during pre-training and fine-tuning, where SP++ achieves better performance.
  • Figure 3: Node classification performance on Twitch.
  • Figure 4: Node classification performance on Elliptic.
  • Figure 5: Ablation on Citation with different pooling functions.
  • ...and 2 more figures
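
Figure 1 reports discrepancy as a CMD value. Assuming CMD here denotes the central moment discrepancy, which compares the means and higher-order central moments of two sets of embeddings, a minimal sketch looks like the following. The function names, the default `max_order=5`, and the `span` argument (the assumed width of the interval bounding each coordinate) are illustrative choices, not settings confirmed by the source.

```python
import math


def _mean(rows):
    """Per-coordinate mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(r[d] for r in rows) / n for d in range(len(rows[0]))]


def _central_moment(rows, mean, order):
    """Per-coordinate central moment of the given order."""
    n = len(rows)
    return [sum((r[d] - mean[d]) ** order for r in rows) / n for d in range(len(mean))]


def _l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cmd(xs, ys, max_order=5, span=1.0):
    """Mean difference plus scaled differences of central moments 2..max_order."""
    mx, my = _mean(xs), _mean(ys)
    total = _l2(mx, my) / span
    for k in range(2, max_order + 1):
        cx = _central_moment(xs, mx, k)
        cy = _central_moment(ys, my, k)
        total += _l2(cx, cy) / span ** k
    return total


# Made-up embeddings: `shifted` is `base` translated by 0.5 in each coordinate.
base = [[0.1, 0.2], [0.3, 0.4]]
shifted = [[0.6, 0.7], [0.8, 0.9]]

cmd_same = cmd(base, base)
cmd_shift = cmd(base, shifted)
```

Because central moments are translation-invariant, the pure shift in this toy example contributes only through the mean term; structural noise that changes the shape of the embedding distribution would also move the higher-order terms.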

Theorems & Definitions (12)

  • Definition 1: Semi-supervised Graph Transfer Learning
  • Definition 2: Node-level Discrepancy
  • Definition 3: Subgraph-level Discrepancy
  • Remark 1
  • Theorem 1
  • Corollary 1
  • Corollary 2
  • Remark 2
  • Example 1
  • Proof
  • ...and 2 more