Graph Structure and Feature Extrapolation for Out-of-Distribution Generalization
Xiner Li, Shurui Gui, Youzhi Luo, Shuiwang Ji
TL;DR
This work tackles graph out-of-distribution generalization by introducing environment-aware non-Euclidean linear extrapolation to synthesize OOD data. It presents G-Splice for structural extrapolation via graph splicing with a bridge generator and FeatX for feature extrapolation via learned non-causal feature perturbations, underpinned by causal-additivity theory. Empirical results across multiple structure-shift and feature-shift benchmarks show that G-Splice and FeatX substantially improve OOD performance, with their combination achieving strong results on complex shifts such as FSMotif. The approach emphasizes the value of explicitly generating OOD graphs and features with environment information, and points to directions for future work including non-linear extrapolation and improved environment discovery.
Abstract
Out-of-distribution (OOD) generalization deals with the prevalent learning scenario where the test distribution shifts from the training distribution. With rising application demands and inherent complexity, graph OOD problems call for specialized solutions. While data-centric methods improve performance on many generic machine learning tasks, there is a notable absence of data augmentation methods tailored for graph OOD generalization. In this work, we propose to achieve graph OOD generalization with a novel design of linear extrapolation in non-Euclidean space. The proposed augmentation strategy extrapolates in both structure and feature spaces to generate OOD graph data. Our design tailors OOD samples to specific shifts without corrupting the underlying causal mechanisms. Theoretical analysis and empirical results demonstrate the effectiveness of our method in addressing the targeted shifts, showing substantial and consistent improvements across various graph OOD tasks.
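To make the idea of linear extrapolation in feature space concrete, the following is a minimal sketch and not the authors' implementation. It assumes two feature vectors drawn from different environments and a hypothetical binary mask marking which coordinates are treated as non-causal; in FeatX such perturbations are described as learned, whereas here the mask is fixed by hand. The function name `linear_extrapolate` and the parameter `lam` are illustrative assumptions.

```python
import numpy as np

def linear_extrapolate(x_a, x_b, lam, mask=None):
    """Extrapolate from x_a along the direction (x_a - x_b).

    x_a, x_b : feature vectors, assumed to come from different environments
    lam      : extrapolation strength; lam > 0 moves past x_a, away from x_b
    mask     : hypothetical 0/1 vector; 1 marks a coordinate treated as
               non-causal (perturbed), 0 marks one left unchanged
    """
    x_a = np.asarray(x_a, dtype=float)
    x_b = np.asarray(x_b, dtype=float)
    if mask is None:
        # With no mask, every coordinate is extrapolated.
        mask = np.ones_like(x_a)
    mask = np.asarray(mask, dtype=float)
    return x_a + lam * mask * (x_a - x_b)

# Toy example: only the first and third coordinates are perturbed.
x_new = linear_extrapolate([1.0, 2.0, 3.0], [0.0, 2.0, 5.0], lam=0.5, mask=[1, 0, 1])
print(x_new)
```

With `lam = 0` the sample is returned unchanged, and larger values push the masked coordinates further outside the range spanned by the two inputs, which is the sense in which the new sample is out of distribution while the unmasked coordinates are preserved.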
