Graph Out-of-Distribution Generalization via Causal Intervention

Qitian Wu; Fan Nie; Chenxiao Yang; Tianyi Bao; Junchi Yan

TL;DR

The paper addresses graph out-of-distribution (OOD) generalization and identifies latent environment confounding as a primary cause of GNN failure under distribution shifts. It introduces CaNet, a causal-intervention framework that learns environment-insensitive predictive relations by jointly training an environment estimator and a mixture-of-experts GNN predictor, guided by a variational objective that approximates $p_\theta(\hat{Y}|do(G))$. The method requires no environment labels: it infers layer-wise pseudo environments and uses them to regularize learning via backdoor adjustment. It achieves substantial improvements on six graph datasets (up to 27.4% in OOD settings) while maintaining in-distribution (ID) performance. Overall, CaNet offers a principled, data-driven approach to graph OOD generalization, with implications for robustness in real-world graph applications and potential extensions to graph Transformers and domain-specific tasks.
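To make the backdoor adjustment mentioned above concrete, the following is a sketch of the standard identity for a confounder $E$ that is a common cause of $G$ and $Y$, followed by the generic Jensen-style lower bound that a variational objective of this kind typically takes. The symbols $q_\phi(E|G)$ (environment estimator) and $p_0(E)$ (prior over pseudo environments) are notation assumed here for illustration; the paper's own equations may differ in form.

```latex
% Backdoor adjustment with E as the only confounder of G and Y
p_\theta(\hat{Y} \mid do(G)) = \sum_{e} p_\theta(\hat{Y} \mid G, E = e)\, p_0(E = e)

% Generic variational lower bound (Jensen's inequality), with an
% assumed estimator q_\phi(E | G) over pseudo environments
\log p_\theta(\hat{Y} \mid do(G))
  \ge \mathbb{E}_{q_\phi(E \mid G)}\big[\log p_\theta(\hat{Y} \mid G, E)\big]
  - \mathrm{KL}\big(q_\phi(E \mid G) \,\|\, p_0(E)\big)
```

Read this way, the first term trains the predictor conditioned on the inferred environment, and the KL term plays the role of the regularizer that keeps the inferred environment from depending too strongly on $G$.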

Abstract

Out-of-distribution (OOD) generalization has gained increasing attention for learning on graphs, as graph neural networks (GNNs) often exhibit performance degradation under distribution shifts. The challenge is that distribution shifts on graphs involve intricate interconnections between nodes, and environment labels are often absent from the data. In this paper, we adopt a bottom-up data-generative perspective and reveal a key observation through causal analysis: the crux of GNNs' failure in OOD generalization lies in the latent confounding bias from the environment. This bias misguides the model to leverage environment-sensitive correlations between ego-graph features and target nodes' labels, resulting in poor generalization on new unseen nodes. Building on this analysis, we introduce a conceptually simple yet principled approach for training robust GNNs under node-level distribution shifts, without prior knowledge of environment labels. Our method relies on a new learning objective derived from causal inference that coordinates an environment estimator and a mixture-of-experts GNN predictor. The new approach can counteract the confounding bias in training data and facilitate learning generalizable predictive relations. Extensive experiments demonstrate that our model can effectively enhance generalization under various types of distribution shifts and yield up to 27.4% accuracy improvement over state-of-the-art methods on graph OOD generalization benchmarks. Source code is available at https://github.com/fannie1208/CaNet.
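As a rough illustration of how an environment estimator and a mixture-of-experts GNN predictor can be coordinated within one layer, here is a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that). The function names, the linear-softmax estimator, the GCN-style normalization, and the ReLU are all assumptions made for this example, and the training objective is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    adj_hat = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(adj_hat.sum(axis=1))
    return adj_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]

def moe_propagation_layer(adj_norm, feats, w_env, w_experts):
    """One hypothetical layer: infer pseudo environments, then mix K experts.

    adj_norm:  (N, N) normalized adjacency
    feats:     (N, D_in) node features
    w_env:     (D_in, K) weights of the (assumed linear) environment estimator
    w_experts: (K, D_in, D_out) one weight matrix per expert
    """
    # Environment estimator: per-node distribution over K pseudo environments.
    env_probs = softmax(feats @ w_env, axis=1)                   # (N, K)
    # Shared neighborhood aggregation.
    propagated = adj_norm @ feats                                # (N, D_in)
    # Each expert applies its own transformation to the aggregated features.
    expert_out = np.stack([propagated @ w for w in w_experts])   # (K, N, D_out)
    # Mix expert outputs per node, weighted by the inferred environment.
    mixed = np.einsum("nk,knd->nd", env_probs, expert_out)       # (N, D_out)
    return np.maximum(mixed, 0.0), env_probs

# Toy usage: a 4-node path graph, 5 input features, 3 pseudo environments.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
adj_norm = normalize_adjacency(adj)
feats = rng.normal(size=(4, 5))
w_env = rng.normal(size=(5, 3))
w_experts = rng.normal(size=(3, 5, 2))
hidden, env_probs = moe_propagation_layer(adj_norm, feats, w_env, w_experts)
```

With a single expert (K = 1) the layer reduces to an ordinary GCN-style layer, which is one way to see the mixture as a conditional extension of standard propagation rather than a separate model.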

Paper Structure (21 sections, 17 equations, 10 figures, 3 tables, 1 algorithm)

Introduction
Related Works
Problem Formulation
Proposed Model
Causal Analysis for Graph Neural Networks
Model Formulation: A Causal Treatment
Model Instantiations
Experiments
Experiment Setup
Comparative Results (R1)
Ablation Studies (R2)
Hyper-parameter Analysis (R3)
Visualization (R4)
Conclusion
Proofs and Derivations
...and 6 more sections

Figures (10)

Figure 1: An illustration based on a social network example for solving node property prediction by GNNs. (a) The task aims at predicting the (target) node's label $y_v$ based on its ego-graph features $\mathcal{G}_v$, i.e., what the GNN model processes as input. (b) Two relations existing in social networks trigger different generalization effects for GNNs trained with observed data. (c) A causal graph describing the dependence among ego-graph features $G$, node label $Y$ and the unobserved environment $E$. The latter is a latent confounder, the common cause for $G$ and $Y$ in the data generation.
Figure 2: Structural Causal Models and data pipelines for (a) the learning process of standard GNNs and (b) our proposed approach, CaNet. The training of common GNNs is affected by the latent confounder $E$, which misguides the model to rely on environment-sensitive correlations between $G$ and $Y$ and leads to unsatisfactory OOD generalization. In contrast, our approach relies on a new learning objective that essentially cuts off the dependence between $E$ and $G$.
Figure 3: Illustration for the proposed model CaNet whose layer-wise computation entails a layer-specific environment estimator and a special feature propagation layer conditioned on the inferred pseudo environment.
Figure 4: Macro F1 score on eight testing sets (by chronologically grouping the testing snapshots) of Elliptic.
Figure 5: Ablation studies. (a) Learning curves on Cora w/ and w/o regularization loss. (b) Ablation results on Arxiv.
...and 5 more figures
