Pairwise Alignment Improves Graph Domain Adaptation

Shikun Liu, Deyu Zou, Han Zhao, Pan Li

TL;DR

This paper tackles graph domain adaptation under distribution shifts that affect features, labels, and graph structure. It introduces Pairwise Alignment (Pair-Align), which simultaneously addresses conditional structure shift (CSS) via edge-weight reweighting with $\boldsymbol{\gamma}$ and label shift (LS) via label-weighted classification loss with $\boldsymbol{\beta}$, supported by robust, iterative estimation of $\mathbf{w}$ and $\boldsymbol{\alpha}$. The approach decomposes structure shift into CSS and LS, provides bootstrap-based multiset alignment of neighbor label distributions, and demonstrates substantial improvements across real-world datasets (including MAG with region-based splits and pileup mitigation) and synthetic CSBM benchmarks. The results indicate that conditional alignment on the neighborhood, when combined with label distribution balancing, yields strong generalization in graph-domain transfer tasks and offers a practical path toward robust GNN-based inference under complex graph shifts.

Abstract

Graph-based methods, pivotal for label inference over interconnected objects in many real-world applications, often encounter generalization challenges if the graph used for model training differs significantly from the graph used for testing. This work delves into Graph Domain Adaptation (GDA) to address the unique complexities of distribution shifts over graph data, where interconnected data points experience shifts in features, labels, and, in particular, connecting patterns. We propose a novel, theoretically principled method, Pairwise Alignment (Pair-Align), to counter graph structure shift by mitigating conditional structure shift (CSS) and label shift (LS). Pair-Align uses edge weights to recalibrate the influence among neighboring nodes to handle CSS and adjusts the classification loss with label weights to handle LS. Our method demonstrates superior performance in real-world applications, including node classification with region shift in social networks and the pileup mitigation task in particle collision experiments. For the first application, we also curate the largest dataset by far for GDA studies. Our method also shows strong performance on synthetic and other existing benchmark datasets.
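The two mechanisms named in the abstract (edge weights for CSS, label weights for LS) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the indexing of `gamma` by the (center label, neighbor label) pair, the normalization by the weight sum, and the toy inputs are all assumptions made for illustration, and the weights are taken as given rather than estimated.

```python
import math

def label_weighted_nll(probs, labels, beta):
    """Mean negative log-likelihood with each sample scaled by beta[label].

    probs[i][c] is the predicted probability of class c for sample i.
    beta[c] is a per-class label weight (assumed given, not estimated here).
    """
    total = 0.0
    for p, y in zip(probs, labels):
        total += -beta[y] * math.log(p[y])
    return total / len(labels)

def edge_weighted_mean(features, labels, neighbors, gamma):
    """Weighted mean of neighbor features for every node.

    The edge (u, v) gets weight gamma[labels[u]][labels[v]], an assumed
    indexing by the label pair of the center node and its neighbor.
    """
    out = []
    for u, nbrs in enumerate(neighbors):
        dim = len(features[u])
        acc = [0.0] * dim
        wsum = 0.0
        for v in nbrs:
            w = gamma[labels[u]][labels[v]]
            wsum += w
            for k in range(dim):
                acc[k] += w * features[v][k]
        # Nodes without (positively weighted) neighbors keep a zero vector.
        out.append([a / wsum for a in acc] if wsum > 0 else acc)
    return out

# Toy usage with hypothetical numbers.
toy_probs = [[0.5, 0.5], [0.25, 0.75]]
toy_loss = label_weighted_nll(toy_probs, labels=[0, 1], beta=[2.0, 1.0])
toy_agg = edge_weighted_mean(
    features=[[1.0], [3.0], [5.0]],
    labels=[0, 0, 1],
    neighbors=[[1, 2], [0], [0]],
    gamma=[[1.0, 3.0], [1.0, 1.0]],
)
```

With all entries of `gamma` and `beta` set to 1, both functions reduce to plain mean aggregation and ordinary cross-entropy, so the weights can be read as corrections applied on top of standard training on the source graph.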

Paper Structure (35 sections, 7 theorems, 30 equations, 2 figures, 11 tables, 1 algorithm)

Key Result

Proposition 2.1

[liu2023structural] Suppose the source and target graphs are generated from the CSBM model of $n$ nodes with the same label distributions and node feature distributions. The edge connection probabilities are set to present a conditional structure shift $\mathbb{P}_\mathcal{S}(\mathbf{A}|\mathbf{Y}) \neq \mathbb{P}_\mathcal{T}(\mathbf{A}|\mathbf{Y})$ [...]
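The setting of Proposition 2.1 (two CSBM graphs that share label and node feature distributions but differ in edge connection probabilities) can be illustrated with a small sampler. This is a sketch under assumptions: the function name `sample_csbm`, the Gaussian class-conditional features, and every numeric value below are hypothetical and are not taken from the paper's experiments.

```python
import random

def sample_csbm(n, class_probs, means, edge_probs, sigma=1.0, seed=0):
    """Sample a contextual stochastic block model (CSBM) graph.

    class_probs[c]   : probability that a node has label c
    means[c]         : mean feature vector for class c (Gaussian, std sigma)
    edge_probs[a][b] : probability of an edge between labels a and b
    Returns (labels, features, edges) with undirected edges as (u, v), u < v.
    """
    rng = random.Random(seed)
    classes = list(range(len(class_probs)))
    # Labels and features are drawn first, so they depend only on the seed,
    # class_probs, means, and sigma, not on edge_probs.
    labels = rng.choices(classes, weights=class_probs, k=n)
    features = [[rng.gauss(m, sigma) for m in means[y]] for y in labels]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < edge_probs[labels[u]][labels[v]]:
                edges.append((u, v))
    return labels, features, edges

# Same label and feature distributions in both domains (hypothetical values).
class_probs = [0.5, 0.5]
means = [[-1.0, 0.0], [1.0, 0.0]]

# Conditional structure shift: only the label-conditioned edge
# probabilities differ between source and target.
source_edge_probs = [[0.30, 0.05], [0.05, 0.30]]
target_edge_probs = [[0.10, 0.20], [0.20, 0.10]]

src = sample_csbm(50, class_probs, means, source_edge_probs, seed=0)
tgt = sample_csbm(50, class_probs, means, target_edge_probs, seed=0)
```

Because labels and features are drawn before any edge in this sketch, reusing the seed gives the source and target graphs identical labels and features, which isolates the change in connection patterns.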

Figures (2)

  • Figure 1: We illustrate structure shifts in real-world datasets: a) The HEP dataset in pileup mitigation tasks (bertolini2014pileup) has a shift in PU levels (a change in the number of other collisions (OC) around the leading collision (LC) in proton-proton collision events), where $\mathcal{G}_\mathcal{S}$ is in PU30 and $\mathcal{G}_\mathcal{T}$ is in PU10. In the green circles, the center nodes in grey are the particles whose labels are to be inferred. They have different ground-truth labels but the same neighborhood, which includes one OC and one LC particle. b) The MAG citation dataset shifts across regions, where the source graph contains papers from the US and the target graph contains papers from Germany. More statistics on graph distribution shift from real-world examples can be found in Appendix \ref{app:shift_stats}.
  • Figure 2: The pipeline contains modules for handling CSS with edge weights $\boldsymbol{\gamma}$ and for handling LS with label weights $\boldsymbol{\beta}$.

Theorems & Definitions (20)

  • Proposition 2.1
  • Definition 3.1: Feature Shift
  • Definition 3.2: Structure Shift
  • Theorem 3.3: Sufficient conditions for addressing CSS
  • Remark 3.4
  • Definition 3.5
  • Definition 3.6
  • Definition 3.7
  • Lemma 3.7
  • Definition 3.8
  • ...and 10 more