Pairwise Alignment Improves Graph Domain Adaptation
Shikun Liu, Deyu Zou, Han Zhao, Pan Li
TL;DR
This paper tackles graph domain adaptation under distribution shifts that affect features, labels, and graph structure. It decomposes structure shift into conditional structure shift (CSS) and label shift (LS) and introduces Pairwise Alignment (Pair-Align) to address both at once: CSS is handled by reweighting edges with $\boldsymbol{\gamma}$, and LS by weighting the classification loss with label weights $\boldsymbol{\beta}$. Both rely on robust, iterative estimation of $\mathbf{w}$ and $\boldsymbol{\alpha}$. The approach also provides bootstrap-based multiset alignment of neighbor label distributions. It demonstrates substantial improvements on real-world datasets (the MAG dataset with region-based splits and a pileup mitigation task) and on synthetic CSBM benchmarks. The results indicate that conditional alignment of node neighborhoods, combined with label distribution balancing, generalizes well in graph domain transfer tasks and offers a practical path toward robust GNN-based inference under complex graph shifts.
Abstract
Graph-based methods, pivotal for label inference over interconnected objects in many real-world applications, often encounter generalization challenges if the graph used for model training differs significantly from the graph used for testing. This work delves into Graph Domain Adaptation (GDA) to address the unique complexities of distribution shifts over graph data, where interconnected data points experience shifts in features, labels, and, in particular, connecting patterns. We propose a novel, theoretically principled method, Pairwise Alignment (Pair-Align), to counter graph structure shift by mitigating conditional structure shift (CSS) and label shift (LS). Pair-Align uses edge weights to recalibrate the influence among neighboring nodes to handle CSS and adjusts the classification loss with label weights to handle LS. Our method demonstrates superior performance in real-world applications, including node classification with region shift in social networks and the pileup mitigation task in particle collision experiments. For the first application, we also curate the largest dataset to date for GDA studies. Our method also shows strong performance on synthetic and other existing benchmark datasets.
