IENE: Identifying and Extrapolating the Node Environment for Out-of-Distribution Generalization on Graphs

Haoran Yang; Xiaobing Pei; Kai Yuan

TL;DR

IENE is an OOD generalization method for graphs based on node-level environment identification and extrapolation. It outperforms existing techniques and provides a flexible framework for improving the generalization of GNNs.

Abstract

Because graph neural networks (GNNs) degrade in performance under distribution shifts, out-of-distribution (OOD) generalization on graphs has received widespread attention. A novel perspective involves distinguishing potential confounding biases from different environments through environment identification, enabling the model to escape environment-sensitive correlations and maintain stable performance under distribution shifts. However, in graph data, confounding factors affect not only the generation process of node features but also the complex interactions between nodes. We observe that neglecting either aspect leads to a decrease in performance. In this paper, we propose IENE, an OOD generalization method on graphs based on node-level environment identification and extrapolation techniques. It strengthens the model's ability to extract invariance at two granularities simultaneously, leading to improved generalization. Specifically, to identify invariance in features, we use the disentangled information bottleneck framework to achieve mutual promotion between node-level environment estimation and invariant feature learning. Furthermore, we extrapolate topological environments through graph augmentation techniques to identify structural invariance. We implement the conceptual method with specific algorithms and provide theoretical analysis and proofs for our approach. Extensive experimental evaluations on two synthetic and four real-world OOD datasets validate the superiority of IENE, which outperforms existing techniques and provides a flexible framework for enhancing the generalization of GNNs.

Paper Structure (46 sections, 4 theorems, 22 equations, 3 figures, 14 tables, 1 algorithm)

Introduction
Preliminaries
Node-Level Environment Extrapolation Problem for OOD Generalization
Environment Identification Problem
Method
Identifying Invariance in features through Environment Partitioning
Identifying Invariance in structures through Environment Extrapolation
Theoretical Analysis
Why IENE can identify invariant features
Under what circumstances can IENE identify invariant features
Algorithm
Identifying Invariant Features through Environment Partitioning
Identifying Invariant Features through Environment Extrapolation
Experiments
Out-of-Distribution Datasets
...and 31 more sections

Key Result

Theorem 1

Based on Assumptions 2-4 and Conditions 1-2, if there exists $\epsilon < \frac{C\gamma\delta}{4\gamma + 2C\delta H(Y)}$ and $\lambda \in \left[\frac{H(Y) + \frac{1}{2}\delta C}{\delta C - 4\epsilon} - \frac{1}{2},\ \frac{\gamma}{4\epsilon} - \frac{1}{2}\right]$, let $\Phi^*$ be the solut...
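The bounds on $\epsilon$ and $\lambda$ in Theorem 1 can be evaluated numerically to see how they interact. The following is a minimal sketch that only computes the two printed expressions; the function names are hypothetical, and because the definitions of $C$, $\gamma$ and $\delta$ are not shown on this page, the values in the usage example are illustrative placeholders rather than quantities from the paper.

```python
import math


def epsilon_upper_bound(C, gamma, delta, entropy_y):
    """Strict upper bound on epsilon: C*gamma*delta / (4*gamma + 2*C*delta*H(Y))."""
    return C * gamma * delta / (4 * gamma + 2 * C * delta * entropy_y)


def lambda_interval(C, gamma, delta, entropy_y, epsilon):
    """Evaluate the two endpoints of the lambda range stated in Theorem 1.

    Assumption of this sketch: epsilon is non-negative, and epsilon == 0
    is treated as giving an unbounded upper endpoint.
    """
    eps_max = epsilon_upper_bound(C, gamma, delta, entropy_y)
    if not 0 <= epsilon < eps_max:
        raise ValueError(f"epsilon must lie in [0, {eps_max:.6g})")
    # epsilon < eps_max implies delta*C - 4*epsilon > 0, so no division by zero.
    lower = (entropy_y + 0.5 * delta * C) / (delta * C - 4 * epsilon) - 0.5
    upper = math.inf if epsilon == 0 else gamma / (4 * epsilon) - 0.5
    return lower, upper


if __name__ == "__main__":
    # Placeholder values, chosen only to exercise the formulas.
    C, gamma, delta, entropy_y = 1.0, 1.0, 1.0, 1.0
    print("epsilon must be below", epsilon_upper_bound(C, gamma, delta, entropy_y))
    lower, upper = lambda_interval(C, gamma, delta, entropy_y, epsilon=0.05)
    print("lambda endpoints:", lower, upper)
    if lower > upper:
        print("no admissible lambda for these values")
```

The sketch returns both endpoints without assuming the range is non-empty, so a caller should compare them before picking a value of $\lambda$.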

Figures (3)

Figure 1: An example of a causal structural model with certain general structural patterns, where $X_1,X_2$ are invariant features, $X_3,X_4$ are spurious features, $Y$ is the target, and $e$ is the environmental factor.
Figure 2: Results on Cora under OOD. Compared to ERM, IENE significantly improves the accuracy and stability of GNN.
Figure 3: Ablation experiments on the value of parameter $\lambda$.

Theorems & Definitions (4)

Theorem 1
Theorem 2
Proposition 1
Lemma 1
