Generative Risk Minimization for Out-of-Distribution Generalization on Graphs
Song Wang, Zhen Tan, Yaochen Zhu, Chuxu Zhang, Jundong Li
TL;DR
This work tackles out-of-distribution generalization on graphs by generating invariant subgraphs instead of extracting them. It introduces Generative Risk Minimization (GRM), which uses a VGAE-based generator to produce a continuous invariant subgraph for each input graph. The generator is guided by a three-term objective that maximizes the subgraph’s predictive power while minimizing its dependence on environment and spurious information through a latent causal variable $Z$. An ELBO-based lower bound enables end-to-end optimization without ground-truth invariant subgraphs, and extensive node- and graph-level experiments show that GRM consistently surpasses state-of-the-art baselines under various distribution shifts. The approach generalizes well, is robust to spurious features, and applies to both node and graph classification, with potential impact on real-world graph learning under shifting environments.
Abstract
Out-of-distribution (OOD) generalization on graphs addresses scenarios where the test graph distribution differs from the training graph distributions. Compared to i.i.d. data like images, the OOD generalization problem on graph-structured data remains challenging due to the non-i.i.d. property and complex structural information on graphs. Recently, several works on graph OOD generalization have explored extracting invariant subgraphs that share crucial classification information across different distributions. Nevertheless, such a strategy can fail to capture the invariant information fully, because extracting discrete structures may lose invariant information or include spurious information. In this paper, we propose a framework named Generative Risk Minimization (GRM), which generates an invariant subgraph for each input graph to be classified rather than extracting one. To address the challenge of optimization in the absence of optimal invariant subgraphs (i.e., ground truths), we derive a tractable form of the proposed GRM objective by introducing a latent causal variable, and its effectiveness is validated by our theoretical analysis. We further conduct extensive experiments across a variety of real-world graph datasets for both node-level and graph-level OOD generalization, and the results demonstrate the superiority of our framework GRM.
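To make the idea of generating a continuous subgraph concrete, the following is a minimal sketch of a generic VGAE-style generator in NumPy. It is not the authors' implementation: the one-layer encoder, the inner-product decoder, the function names, and all dimensions are assumptions chosen for illustration. It shows only how a variational graph autoencoder can map an input graph to soft edge weights in [0, 1] together with the KL term that typically appears in an ELBO.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]


def generate_soft_subgraph(adj, feats, w_mu, w_logvar, rng):
    """Encode nodes, sample latents, and decode continuous edge weights.

    Hypothetical one-layer VGAE-style pass; the actual GRM architecture
    may differ.
    """
    a_norm = normalize_adjacency(adj)
    mu = a_norm @ feats @ w_mu            # per-node latent mean
    logvar = a_norm @ feats @ w_logvar    # per-node latent log-variance
    # Reparameterization trick: z = mu + sigma * eps
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    # Inner-product decoder: sigmoid(z z^T) gives soft edge weights
    soft_adj = 1.0 / (1.0 + np.exp(-(z @ z.T)))
    # KL(q(z | G) || N(0, I)), summed over latent dims, averaged over nodes
    kl = -0.5 * np.mean(np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1))
    return soft_adj, kl


# Toy input: a 4-node path graph with random 3-dimensional node features.
rng = np.random.default_rng(0)
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
feats = rng.standard_normal((4, 3))
w_mu = 0.1 * rng.standard_normal((3, 2))
w_logvar = 0.1 * rng.standard_normal((3, 2))

soft_adj, kl = generate_soft_subgraph(adj, feats, w_mu, w_logvar, rng)
```

In this sketch `soft_adj` is a dense, symmetric matrix of edge weights rather than a discrete edge mask, which is the sense in which a generated subgraph can be continuous. A downstream classifier and the remaining objective terms described above would be trained on top of it; those parts are omitted here because the text does not specify them.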
