Decoupled Graph Energy-based Model for Node Out-of-Distribution Detection on Heterophilic Graphs

Yuhan Chen, Yihong Luo, Yifan Song, Pengwen Dai, Jing Tang, Xiaochun Cao

TL;DR

This work tackles node-level OOD detection on graphs, a challenging problem due to node interdependence and heterophily. It introduces DeGEM, a decoupled energy-based framework that learns a graph encoder via Graph Contrastive Learning and trains a latent-space energy head with Maximum Likelihood Estimation, avoiding energy propagation and adjacency-sampling. The approach, enhanced by a Multi-Hop encoder, Conditional Energy, Energy Readout, and a Recurrent Update, achieves state-of-the-art AUROC on both homophilic and heterophilic graphs without OOD exposure and remains robust under limited labels. By moving MCMC sampling into the latent space, DeGEM delivers scalable, data-distribution-aware OOD detection with strong practical impact for graph-structured data.
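To make "MCMC sampling in the latent space" concrete, here is a minimal sketch, not the authors' implementation. It assumes the graph encoder has already produced fixed latent vectors (synthetic 2-D points stand in for them), and it replaces the energy head with a toy quadratic energy that has a single learnable `center`. The names `energy`, `langevin_sample`, and `fit_energy_head` are hypothetical. Negative samples are drawn with Langevin dynamics over latent vectors only, so no node features or adjacency entries are ever sampled.

```python
import math
import random


def energy(z, center):
    """Toy energy head (an assumption): E(z) = 0.5 * ||z - center||^2."""
    return 0.5 * sum((zi - ci) ** 2 for zi, ci in zip(z, center))


def langevin_sample(center, dim, rng, n_steps=20, step_size=0.5):
    """Draw one approximate model sample with Langevin dynamics in latent space."""
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start the chain from noise
    for _ in range(n_steps):
        # For the toy energy above, dE/dz = z - center.
        z = [
            zi - 0.5 * step_size * (zi - ci)
            + math.sqrt(step_size) * rng.gauss(0.0, 1.0)
            for zi, ci in zip(z, center)
        ]
    return z


def mean_vector(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def fit_energy_head(latents, n_epochs=100, lr=0.1, n_chains=32, seed=0):
    """Fit the toy energy head by maximum likelihood with MCMC negatives."""
    rng = random.Random(seed)
    dim = len(latents[0])
    center = [0.0] * dim
    data_mean = mean_vector(latents)
    for _ in range(n_epochs):
        negatives = [langevin_sample(center, dim, rng) for _ in range(n_chains)]
        model_mean = mean_vector(negatives)
        # NLL gradient = E_data[dE/dcenter] - E_model[dE/dcenter].
        # With dE/dcenter = center - z this reduces to model_mean - data_mean.
        center = [
            c - lr * (mm - dm)
            for c, mm, dm in zip(center, model_mean, data_mean)
        ]
    return center


# Synthetic stand-ins for encoder outputs of in-distribution nodes.
data_rng = random.Random(1)
latents = [
    [3.0 + data_rng.gauss(0.0, 1.0), -2.0 + data_rng.gauss(0.0, 1.0)]
    for _ in range(200)
]
data_mean = mean_vector(latents)
center_hat = fit_energy_head(latents)

# Higher energy means "more likely OOD" under this toy model.
in_dist_energy = energy([3.0, -2.0], center_hat)
far_energy = energy([-4.0, 5.0], center_hat)
print(center_hat, in_dist_energy, far_energy)
```

In this sketch the fitted `center_hat` settles near the mean of the in-distribution latents, so a latent far from them receives a much higher energy. A real energy head would presumably be a neural network trained with the same data-minus-model gradient, but that detail is not stated in this summary.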

Abstract

Despite extensive research on out-of-distribution (OOD) detection for images, OOD detection on nodes in graph learning remains underexplored. The dependence among graph nodes hinders the trivial adaptation of existing image-based approaches, which assume inputs to be i.i.d. sampled and therefore do not consider many features and challenges specific to graphs, such as heterophily. Recently, GNNSafe, which considers node dependence, adapted energy-based detection to the graph domain with state-of-the-art performance. However, it has two serious issues: 1) it derives node energy from classification logits without training specifically tailored to modeling the data distribution, making it less effective at recognizing OOD data; 2) it relies heavily on energy propagation, which is based on the homophily assumption and causes significant performance degradation on heterophilic graphs, where a node tends to have a distribution dissimilar to that of its neighbors. To address these issues, we suggest training energy-based models (EBMs) by maximum likelihood estimation (MLE) to enhance data distribution modeling, and removing energy propagation to overcome the heterophily issue. However, training EBMs via MLE requires performing Markov chain Monte Carlo (MCMC) sampling on both node features and node neighbors, which is challenging due to node interdependence and the discrete graph topology. To tackle the sampling challenge, we introduce DeGEM, which decomposes the learning process into two parts: a graph encoder that leverages topology information for node representations and an energy head that operates in latent space. Extensive experiments validate that DeGEM, without OOD exposure during training, surpasses previous state-of-the-art methods, achieving an average AUROC improvement of 6.71% on homophilic graphs and 20.29% on heterophilic graphs, and even outperforms methods trained with OOD exposure. Our code is available at: https://github.com/draym28/DeGEM.
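The second issue raised in the abstract, that energy propagation depends on homophily, can be illustrated with a small sketch. It assumes the widely used logit-based energy score E(x) = -log Σ_c exp(f_c(x)) and a simple propagation rule that mixes each node's energy with the mean energy of its neighbors. The mixing weight `alpha`, the four-node graphs, and the logit values are made up for illustration and are not taken from the paper.

```python
import math


def logit_energy(logits):
    """Energy from classification logits: E = -logsumexp(logits), computed stably."""
    m = max(logits)
    return -(m + math.log(sum(math.exp(v - m) for v in logits)))


def propagate_energy(energies, neighbors, alpha=0.5, n_steps=1):
    """Assumed rule: E_i <- alpha * E_i + (1 - alpha) * mean of neighbor energies."""
    current = list(energies)
    for _ in range(n_steps):
        updated = []
        for i, e in enumerate(current):
            nbrs = neighbors[i]
            nbr_mean = sum(current[j] for j in nbrs) / len(nbrs) if nbrs else e
            updated.append(alpha * e + (1 - alpha) * nbr_mean)
        current = updated
    return current


# Nodes 0 and 1 are in-distribution (confident logits, low energy);
# nodes 2 and 3 are OOD (flat logits, higher energy).
logits = [[8.0, 0.0, 0.0], [8.0, 0.0, 0.0], [0.5, 0.4, 0.6], [0.5, 0.4, 0.6]]
energies = [logit_energy(row) for row in logits]
gap_before = energies[2] - energies[0]

# Homophilic toy graph: each node is linked only to a node of the same kind.
homophilic = {0: [1], 1: [0], 2: [3], 3: [2]}
# Heterophilic toy graph: every edge joins an in-distribution node to an OOD node.
heterophilic = {0: [2, 3], 1: [2, 3], 2: [0, 1], 3: [0, 1]}

homo = propagate_energy(energies, homophilic)
hetero = propagate_energy(energies, heterophilic)

print("gap before propagation:", gap_before)
print("gap after, homophilic:", homo[2] - homo[0])
print("gap after, heterophilic:", hetero[2] - hetero[0])
```

In this deliberately extreme case, propagation leaves the energy gap between in-distribution and OOD nodes unchanged on the homophilic graph, but erases it on the heterophilic one, because each node averages in the energies of neighbors of the opposite kind.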

Paper Structure

This paper contains 30 sections, 19 equations, 4 figures, 14 tables, 1 algorithm.

Figures (4)

  • Figure 1: AUROC across graphs.
  • Figure 2: (A, B) The detailed graph contrastive learning and EBM training process. The readout of the original node representations in A participates in the Conditional Energy (CE) in B, and the original node energies in B are delivered to A for Energy Readout (ERo). We propose a Recurrent Update mechanism to jointly train the CE and ERo effectively. (C) Comparison between traditional MCMC sampling and our proposed MCMC sampling.
  • Figure 3: Visualization comparison on 2D data.
  • Figure 4: Performance across labeled ratios.