Cooperative Classification and Rationalization for Graph Generalization

Linan Yue, Qi Liu, Ye Liu, Weibo Gao, Fangzhou Yao, Wenfeng Li

TL;DR

This paper proposes Cooperative Classification and Rationalization (C2R), a method consisting of a classification module and a rationalization module. The classification module introduces diverse training distributions using an environment-conditional generative network, enabling robust graph representations.

Abstract

Graph Neural Networks (GNNs) have achieved impressive results in graph classification tasks, but they struggle to generalize when faced with out-of-distribution (OOD) data. Several approaches have been proposed to address this problem. One solution is to diversify training distributions in vanilla classification by modifying the data environment, yet environment information is difficult to access. Another promising approach is rationalization, which extracts invariant rationales for predictions. However, extracting rationales is difficult due to limited learning signals, resulting in less accurate rationales and weaker predictions. To address these challenges, we propose a Cooperative Classification and Rationalization (C2R) method, consisting of a classification module and a rationalization module. Specifically, we first assume that multiple environments are available in the classification module. Then, we introduce diverse training distributions using an environment-conditional generative network, enabling robust graph representations. Meanwhile, the rationalization module employs a separator to identify relevant rationale subgraphs, while the remaining non-rationale subgraphs are de-correlated with labels. Next, we align graph representations from the classification module with rationale subgraph representations using knowledge distillation, enhancing the learning signal for rationales. Finally, we infer multiple environments by gathering non-rationale representations and incorporate them into the classification module for cooperative learning. Extensive experimental results on both benchmark and synthetic datasets demonstrate the effectiveness of C2R. Code is available at https://github.com/yuelinan/Codes-of-C2R.
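
Two steps of the abstract, the knowledge-distillation alignment and the inference of environments from non-rationale representations, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the alignment is a temperature-softened KL divergence between the classification module's logits (teacher) and the rationale logits (student), and that "gathering" non-rationale representations means k-means clustering into k environments. The function names distillation_loss and infer_environments, the temperature parameter, and the farthest-point initialization are all hypothetical choices made for illustration.

```python
import numpy as np


def log_softmax(z):
    """Numerically stable row-wise log-softmax."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened predictions.

    Assumption: the teacher is the classification module and the student
    is the rationale predictor; the source does not specify the loss.
    """
    log_p = log_softmax(teacher_logits / temperature)
    log_q = log_softmax(student_logits / temperature)
    kl_per_graph = np.sum(np.exp(log_p) * (log_p - log_q), axis=1)
    return float(kl_per_graph.mean())


def infer_environments(non_rationale_repr, k, n_iter=20):
    """Assign each graph to one of k environments with plain k-means.

    Assumption: environments are clusters of non-rationale representations.
    Centers start from point 0 and then the farthest remaining points,
    which keeps the sketch deterministic.
    """
    z = np.asarray(non_rationale_repr, dtype=float)
    centers = [z[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        nearest = np.min([np.linalg.norm(z - c, axis=1) for c in centers], axis=0)
        centers.append(z[nearest.argmax()])
    centers = np.stack(centers)

    for _ in range(n_iter):
        dists = np.linalg.norm(z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = z[labels == j].mean(axis=0)
    return labels


if __name__ == "__main__":
    teacher = np.array([[2.0, 0.5], [0.1, 1.5]])
    student = np.array([[1.0, 1.0], [0.3, 0.9]])
    print("distillation loss:", distillation_loss(teacher, student))

    # Toy non-rationale representations forming two well-separated groups.
    reps = np.array([[0, 0], [0, 1], [1, 0], [100, 100], [100, 101], [101, 100]])
    print("environment ids:", infer_environments(reps, k=2).tolist())
```

In this reading, the environment ids would be fed back to the classification module as conditioning labels for the environment-conditional generative network. How that conditioning is done is not described in the abstract and is not sketched here.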

Paper Structure (25 sections, 15 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: An example of motif type prediction, where House and Cycle are motif labels, and Tree and Wheel are base subgraphs. Within the training set, there is a substantial disparity in the occurrence of House-Tree graphs compared to Cycle-Wheel graphs. This means that the number of House-Tree graphs ($\mathcal{N}$) greatly exceeds the number of Cycle-Wheel graphs ($\mathcal{K}$). Consequently, GNNs trained on such imbalanced data distributions tend to exhibit higher accuracy when handling in-distribution data, specifically House-Tree graphs. However, these models are more susceptible to making errors when faced with out-of-distribution (OOD) data, such as Cycle-Tree graphs.
  • Figure 2: Architecture of C2R, including the classification and rationalization modules.
  • Figure 4: Ablation study and hyperparameter sensitivity analysis of C2R, implemented with GIN, on OGB.
  • Figure 5: Hyperparameter Sensitivity Analysis of the number of inductive environments $k$.
  • Figure 6: Training process of C2R on MolSIDER.
  • ...and 1 more figures