Structural Entropy Guided Unsupervised Graph Out-Of-Distribution Detection
Yue Hou, He Zhu, Ruomei Liu, Yingke Su, Jinxiang Xia, Junran Wu, Ke Xu
TL;DR
This work tackles unsupervised graph-level OOD detection by minimizing structural entropy to extract essential graph information and remove redundancy. It proposes SEGO, a framework that builds a coding tree as a redundancy-eliminated anchor view and applies a multi-grained contrastive learning scheme over triplet views (basic graph, anchor tree, and topological views) to maximize agreement at the local, global, and tree levels. The authors prove that the anchor view obtained by structural entropy minimization preserves maximum mutual information with the ground-truth targets. Empirically, SEGO outperforms 14 baselines, achieving the best results on 9 of 10 dataset pairs with an average improvement of 3.7% in OOD detection, along with notable gains in anomaly detection. The approach offers a principled, structure-aware alternative to augmentation-heavy graph self-supervised methods for building reliable graph-based systems.
Abstract
With the emergence of huge amounts of unlabeled data, unsupervised out-of-distribution (OOD) detection is vital for ensuring the reliability of graph neural networks (GNNs): it identifies OOD samples among in-distribution (ID) ones at test time, when encountering novel or unknown data is inevitable. Existing methods often suffer from degraded performance because redundant information in graph structures impairs their ability to differentiate between ID and OOD data. To address this challenge, we propose SEGO, an unsupervised framework that integrates structural entropy into OOD detection for graph classification. Specifically, within a contrastive learning architecture, SEGO introduces an anchor view in the form of a coding tree obtained by minimizing structural entropy. The resulting coding tree removes redundant information from graphs while preserving essential structural information, enabling the model to capture the distinct graph patterns that separate ID from OOD samples. Furthermore, we present a multi-grained contrastive learning scheme at the local, global, and tree levels using triplet views, in which coding trees carrying the essential information serve as the anchor view. Extensive experiments on real-world datasets validate the effectiveness of SEGO, demonstrating superior performance over state-of-the-art baselines in OOD detection. In particular, our method achieves the best performance on 9 out of 10 dataset pairs, with an average improvement of 3.7% on OOD detection datasets, and surpasses the best competitor by 10.8% on the FreeSolv/ToxCast dataset pair.
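To make the notion of structural entropy under a coding tree more concrete, the following is a minimal Python sketch, not SEGO's implementation. The abstract does not state the formula; the sketch assumes the standard definition from structural information theory, where each non-root tree node contributes -(g / vol(G)) * log2(vol(node) / vol(parent)), with g the number of edges leaving the node's vertex set. It handles only a height-2 coding tree (root, communities, leaves) on an unweighted, undirected graph. The function name `structural_entropy` and the toy graph are hypothetical.

```python
import math
from collections import defaultdict


def structural_entropy(edges, partition):
    """Structural entropy of a graph under a height-2 coding tree.

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    partition: dict mapping each node to a community id; the communities
    are the children of the root, and the graph nodes are the leaves.
    Assumes the standard definition, not code from the paper.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    vol_g = sum(degree.values())  # graph volume, equal to 2 * |E|

    vol = defaultdict(int)  # volume (degree sum) of each community
    cut = defaultdict(int)  # number of edges leaving each community
    for node, d in degree.items():
        vol[partition[node]] += d
    for u, v in edges:
        if partition[u] != partition[v]:
            cut[partition[u]] += 1
            cut[partition[v]] += 1

    entropy = 0.0
    # Community nodes: the parent is the root, whose volume is vol_g.
    for c, vol_c in vol.items():
        entropy -= cut[c] / vol_g * math.log2(vol_c / vol_g)
    # Leaf nodes: the parent is the community; a leaf's cut is its degree.
    for node, d in degree.items():
        entropy -= d / vol_g * math.log2(d / vol[partition[node]])
    return entropy


# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
mixed = {0: "a", 3: "a", 1: "b", 4: "b", 2: "c", 5: "c"}
single = {n: "all" for n in range(6)}

for name, part in [("by_triangle", by_triangle), ("mixed", mixed), ("single", single)]:
    print(name, round(structural_entropy(edges, part), 4))
```

In this toy case the partition that follows the two triangles yields the lowest entropy of the three, which illustrates why a coding tree chosen to minimize structural entropy tends to reflect the graph's essential community structure. Searching over trees of greater height, as an actual minimization procedure would, is outside the scope of this sketch.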
