Generative Risk Minimization for Out-of-Distribution Generalization on Graphs

Song Wang, Zhen Tan, Yaochen Zhu, Chuxu Zhang, Jundong Li

TL;DR

This work tackles out-of-distribution generalization on graphs by shifting from extracting invariant subgraphs to generating them. It introduces Generative Risk Minimization (GRM), which uses a VGAE-based generator to produce a continuous invariant subgraph for each input graph. The generator is guided by a three-term objective that maximizes the subgraph’s predictive power while minimizing its dependence on environment and spurious information through a latent causal variable $Z$. A theoretical ELBO-based lower bound enables end-to-end optimization without ground-truth invariant subgraphs, and extensive node- and graph-level experiments show GRM consistently surpasses state-of-the-art baselines under various distribution shifts. The approach demonstrates strong generalization, robustness to spurious features, and applicability to both node and graph classification tasks, with potential impact on real-world graph learning under shifting environments.
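To make the phrase "VGAE-based generator that produces a continuous subgraph" concrete, the following is a minimal sketch of a generic variational graph auto-encoder forward pass, not the authors' implementation. It assumes a single linear GCN layer per encoder head, an inner-product decoder, and a standard normal prior on $Z$; all function names, weights, and the toy graph are hypothetical. The decoder output is a matrix of soft edge weights rather than a discrete edge selection, and the KL term is the usual VGAE regularizer, not the full GRM objective.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adj(adj):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def encode(adj, feats, w_mu, w_logvar):
    """Assumed encoder: one linear GCN layer per head, giving the mean and log-variance of Z."""
    h = matmul(normalize_adj(adj), feats)
    return matmul(h, w_mu), matmul(h, w_logvar)


def reparameterize(mu, logvar, rng):
    """Sample Z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [[m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mr, lr)]
            for mr, lr in zip(mu, logvar)]


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def decode(z):
    """Inner-product decoder: sigmoid(z_i . z_j) is a soft (continuous) edge weight."""
    z_t = [list(col) for col in zip(*z)]
    return [[sigmoid(v) for v in row] for row in matmul(z, z_t)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over nodes and latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for mr, lr in zip(mu, logvar) for m, lv in zip(mr, lr))


# Toy 3-node path graph with 2-dimensional node features (hypothetical data).
rng = random.Random(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_mu = [[0.5, -0.3], [0.2, 0.4]]          # hypothetical encoder weights
w_logvar = [[-0.1, 0.0], [0.0, -0.1]]     # hypothetical encoder weights

mu, logvar = encode(adj, feats, w_mu, w_logvar)
z = reparameterize(mu, logvar, rng)
soft_adj = decode(z)                       # continuous adjacency of the generated subgraph
kl = kl_to_standard_normal(mu, logvar)     # regularization term on Z
```

In a trainable version, `soft_adj` would be passed to a classifier and the weights updated by gradient descent; the sketch only shows why a generated subgraph can carry fractional edge weights that a discrete extraction step cannot.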

Abstract

Out-of-distribution (OOD) generalization on graphs aims at dealing with scenarios where the test graph distribution differs from the training graph distributions. Compared to i.i.d. data like images, the OOD generalization problem on graph-structured data remains challenging due to the non-i.i.d. property and complex structural information on graphs. Recently, several works on graph OOD generalization have explored extracting invariant subgraphs that share crucial classification information across different distributions. Nevertheless, such a strategy could be suboptimal for entirely capturing the invariant information, as the extraction of discrete structures could potentially lead to the loss of invariant information or the involvement of spurious information. In this paper, we propose an innovative framework, named Generative Risk Minimization (GRM), designed to generate an invariant subgraph for each input graph to be classified, instead of extraction. To address the challenge of optimization in the absence of optimal invariant subgraphs (i.e., ground truths), we derive a tractable form of the proposed GRM objective by introducing a latent causal variable, and its effectiveness is validated by our theoretical analysis. We further conduct extensive experiments across a variety of real-world graph datasets for both node-level and graph-level OOD generalization, and the results demonstrate the superiority of our framework GRM.

Paper Structure

This paper contains 31 sections, 1 theorem, 22 equations, 4 figures, and 9 tables.

Key Result

Theorem 3.1

An evidence lower bound (ELBO) for optimization of the GRM objective, by introducing a latent causal variable $Z$ and variational approximations $Q(Z)$ and $Q(\widehat{G}_c)$, is as follows:

Figures (4)

  • Figure 1: The structural causal models (SCMs) with distribution shifts (left) and without distribution shifts (right).
  • Figure 2: The overall framework of GRM. Each input graph $G$ is processed by the encoder of our generator to learn the latent variable $Z$. Then we extract the most influential nodes from the domain and learn a domain-specific representation for each node in $G$. These domain-specific representations will be used in the invariance loss. We further classify the output invariant subgraph with a classifier to obtain the predictions. The regularization loss is calculated for $Z$ and the invariant subgraph.
  • Figure 3: The results of various methods on dataset Cora-Mix with different degrees of distribution shifts.
  • Figure 4: Ablation study of our framework GRM with different variants evaluated on six real-world datasets.

Theorems & Definitions (2)

  • Theorem 3.1
  • proof