Structural Entropy Guided Unsupervised Graph Out-Of-Distribution Detection

Yue Hou, He Zhu, Ruomei Liu, Yingke Su, Jinxiang Xia, Junran Wu, Ke Xu

TL;DR

This work tackles unsupervised graph-level OOD detection by minimizing structural entropy to extract essential graph information and remove redundancy. It proposes SEGO, a framework that builds a coding tree as the redundancy-eliminated anchor view and applies multi-grained contrastive learning over triplet views (basic graph, anchor tree, and topological views) to maximize agreement at the local, global, and tree levels. The authors prove that anchor-view minimization preserves maximum mutual information with ground-truth targets. Compared against 14 baselines on 10 dataset pairs, SEGO achieves the best performance on 9 pairs, with an average improvement of 3.7% in OOD detection and notable gains in anomaly detection. The approach offers a principled, structure-aware alternative to augmentation-heavy graph self-supervised methods.

Abstract

With the emergence of huge amounts of unlabeled data, unsupervised out-of-distribution (OOD) detection is vital for ensuring the reliability of graph neural networks (GNNs): it identifies OOD samples among in-distribution (ID) ones during testing, where encountering novel or unknown data is inevitable. Existing methods often suffer from compromised performance due to redundant information in graph structures, which impairs their ability to differentiate between ID and OOD data. To address this challenge, we propose SEGO, an unsupervised framework that integrates structural entropy into OOD detection for graph classification. Specifically, within a contrastive learning architecture, SEGO introduces an anchor view in the form of a coding tree obtained by minimizing structural entropy. The resulting coding tree removes redundant information from graphs while preserving essential structural information, enabling the capture of distinct graph patterns between ID and OOD samples. Furthermore, we present a multi-grained contrastive learning scheme at local, global, and tree levels using triplet views, where coding trees with essential information serve as the anchor view. Extensive experiments on real-world datasets validate the effectiveness of SEGO, demonstrating superior performance over state-of-the-art baselines in OOD detection. Specifically, our method achieves the best performance on 9 out of 10 dataset pairs, with an average improvement of 3.7% on OOD detection datasets, and surpasses the best competitor by 10.8% on the FreeSolv/ToxCast dataset pair.
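The abstract does not spell out the quantity being minimized. As a minimal sketch, assuming the standard definitions from structural information theory (not taken from this page), the snippet below computes the one-dimensional structural entropy of an undirected, unweighted graph and the structural entropy of a height-2 coding tree, that is, a flat partition of the nodes into modules. The function names and the toy graph are hypothetical, and SEGO's actual coding-tree construction algorithm is not reproduced here.

```python
import math
from collections import defaultdict


def degrees(edges):
    """Node degrees of an undirected, unweighted graph given as an edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def one_dim_structural_entropy(edges):
    """H^1(G) = -sum_i (d_i / vol) * log2(d_i / vol), with vol = sum of degrees."""
    deg = degrees(edges)
    vol = sum(deg.values())
    return -sum((d / vol) * math.log2(d / vol) for d in deg.values())


def two_dim_structural_entropy(edges, partition):
    """Structural entropy of a height-2 coding tree (root -> modules -> nodes).

    Assumes every node appears in exactly one module and has at least one edge.
    """
    deg = degrees(edges)
    vol = sum(deg.values())
    module_of = {n: i for i, module in enumerate(partition) for n in module}

    # cut[i] counts edges with exactly one endpoint inside module i.
    cut = [0] * len(partition)
    for u, v in edges:
        if module_of[u] != module_of[v]:
            cut[module_of[u]] += 1
            cut[module_of[v]] += 1

    entropy = 0.0
    for i, module in enumerate(partition):
        vol_m = sum(deg[n] for n in module)
        # Term for the module node: its parent is the root, whose volume is vol.
        entropy -= (cut[i] / vol) * math.log2(vol_m / vol)
        # Terms for the leaves: each leaf's cut equals its degree.
        for n in module:
            entropy -= (deg[n] / vol) * math.log2(deg[n] / vol_m)
    return entropy


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
BARBELL = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
GOOD_PARTITION = [[0, 1, 2], [3, 4, 5]]    # follows the two triangles
BAD_PARTITION = [[0, 3], [1, 4], [2, 5]]   # cuts across both triangles

h1 = one_dim_structural_entropy(BARBELL)
h2_good = two_dim_structural_entropy(BARBELL, GOOD_PARTITION)
h2_bad = two_dim_structural_entropy(BARBELL, BAD_PARTITION)
```

On this toy graph the partition that follows the two triangles yields a lower entropy than the one that cuts across them, which is the sense in which a low-entropy coding tree keeps the essential community structure and discards the rest.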

Paper Structure

This paper contains 22 sections, 2 theorems, 14 equations, 6 figures, 3 tables.

Key Result

Lemma 1

Let $f$ be a GNN encoder with learnable parameters and let $G^\ast$ be the target anchor view of graph $G$. If $I(f(G^\ast); f(G))$ reaches its maximum, then $I(f(G^\ast); G)$ also reaches its maximum.
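
One way to read the lemma, offered as an interpretive sketch and not as the authors' proof: because $f(G)$ is a deterministic function of $G$, the variables form a Markov chain $f(G^\ast) \to G \to f(G)$, and the data processing inequality gives

$$I\bigl(f(G^\ast);\, f(G)\bigr) \;\le\; I\bigl(f(G^\ast);\, G\bigr).$$

Increasing the left-hand side through contrastive training therefore raises a lower bound on the mutual information between the anchor-view embedding and the original graph. The inequality alone does not establish that both quantities attain their maxima together; that stronger statement rests on the paper's own argument.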

Figures (6)

  • Figure 1: A toy example of ID and OOD graphs and scoring distributions before/after structural entropy minimization.
  • Figure 2: Overview of our proposed SEGO, which employs multi-grained contrast at local, global, and tree levels using triplet views. The coding tree, obtained by minimizing the structural entropy of the graph, serves as the essential anchor view that eliminates redundant information. As shown in the subfigure on the right, message passing and aggregation on the graph are guided by the coding tree.
  • Figure 3: The effectiveness of different views.
  • Figure 4: T-SNE visualization of embeddings.
  • Figure 5: The natural hierarchy of a graph.
  • ...and 1 more figure

Theorems & Definitions (3)

  • Definition 1
  • Lemma 1
  • Lemma 2