HGOE: Hybrid External and Internal Graph Outlier Exposure for Graph Out-of-Distribution Detection

Junwei He, Qianqian Xu, Yangbangyan Jiang, Zitai Wang, Yuchen Sun, Qingming Huang

TL;DR

This work tackles graph-level OOD detection by introducing Hybrid External and Internal Graph Outlier Exposure (HGOE), which combines diverse external outliers from cross-domain sources with synthesized internal outliers created from within in-distribution subgroups using graphon-based mixes. A boundary-aware OE loss guides learning by weighting outliers toward the ID boundary while suppressing invalid intrusions, enabling integration with existing detectors. Key innovations include ID-mixup to generate internal outliers via graphon mixtures $\mathcal{M}=\lambda W_i+(1-\lambda)W_j$ and feature alignment to external outliers, plus a principled loss that adaptively emphasizes boundary samples through $\ell_{ba}$. Empirical results on 8 real-world datasets show that HGOE consistently improves graph OOD performance and that both external and internal outliers contribute to gains, with interpretable visualizations and comprehensive ablations supporting the design choices.
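The graphon mixture $\mathcal{M}=\lambda W_i+(1-\lambda)W_j$ can be illustrated with a minimal NumPy sketch. It assumes each graphon is stored as a square step-function matrix of edge probabilities at a shared resolution, and it samples a graph with the standard graphon sampling procedure (uniform latent positions, independent Bernoulli edges). The function names `mix_graphons` and `sample_graph` are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def mix_graphons(w_i, w_j, lam):
    """Return the convex combination lam * w_i + (1 - lam) * w_j."""
    if w_i.shape != w_j.shape:
        raise ValueError("graphons must share the same resolution")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * w_i + (1.0 - lam) * w_j


def sample_graph(graphon, num_nodes, rng):
    """Sample an undirected simple graph from a step-function graphon."""
    res = graphon.shape[0]
    # Latent position of each node in [0, 1), mapped to a graphon cell index.
    u = rng.random(num_nodes)
    idx = np.minimum((u * res).astype(int), res - 1)
    # Edge probability for every node pair.
    probs = graphon[np.ix_(idx, idx)]
    # Draw the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((num_nodes, num_nodes)) < probs, k=1)
    return (upper | upper.T).astype(np.int8)


rng = np.random.default_rng(0)
w_a = np.full((4, 4), 0.9)  # toy graphon of a dense subgroup (assumed)
w_b = np.full((4, 4), 0.1)  # toy graphon of a sparse subgroup (assumed)
mixed = mix_graphons(w_a, w_b, lam=0.5)
adj = sample_graph(mixed, num_nodes=10, rng=rng)
```

In this toy setting the mixed graphon has edge probability 0.5 everywhere, so sampled graphs fall between the two subgroups in density, which is the intuition behind placing internal outliers between ID subgroups.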

Abstract

With the progressive advancements in deep graph learning, out-of-distribution (OOD) detection for graph data has emerged as a critical challenge. While the efficacy of auxiliary datasets in enhancing OOD detection has been extensively studied for image and text data, such approaches have not yet been explored for graph data. Unlike Euclidean data, graph data exhibits greater diversity but lower robustness to perturbations, which complicates the integration of outliers. To tackle these challenges, we propose Hybrid External and Internal Graph Outlier Exposure (HGOE) to improve graph OOD detection performance. Our framework uses realistic external graph data from various domains and synthesizes internal outliers within ID subgroups to address the poor robustness and the presence of OOD samples within the ID class. Furthermore, we develop a boundary-aware OE loss that adaptively assigns weights to outliers, maximizing the use of high-quality OOD samples while minimizing the impact of low-quality ones. The proposed HGOE framework is model-agnostic and designed to enhance the effectiveness of existing graph OOD detection models. Experimental results demonstrate that HGOE significantly improves the performance of existing OOD detection models across all 8 real-world datasets.

Paper Structure (26 sections, 9 equations, 5 figures, 4 tables)


Figures (5)

  • Figure 1: Illustration of the distribution differences between images (left) and graphs (right). Blue samples are from ID classes, while orange ones are from OOD classes. Notably, ID classes form clusters in images, while in graphs, they split into subgroups with potential OOD samples in between them.
  • Figure 2: Overview of the proposed HGOE framework. We collect external outliers from public graph databases. Given ID graphs, we extract features using GraphCL and cluster them to obtain multiple ID subgroups. We then estimate a graphon for each subgroup and mix these graphons to obtain graphons for internal outliers. Graph structures sampled from the mixed graphons are assigned node features by aligning with features from external outliers. Finally, the synthesized internal outliers and real-world external outliers are jointly optimized using a boundary-aware loss.
  • Figure 3: Visualization of graphons obtained by ID-mixup. The rows are from ENZYMES and FreeSolv datasets, respectively. The first and third columns are graphons of two subgroups, and the second column shows their mixup results. Brighter cells indicate a higher probability of edge existence at that location.
  • Figure 4: Score distributions on several graph datasets. The left column shows results without HGOE, while the right column shows results with HGOE. The overlap between the ID and OOD score distributions becomes smaller after introducing HGOE.
  • Figure 5: Performance gain of HGOE compared to $\gamma=0$ when $\gamma$ varies.
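
The graphon-estimation step described in Figures 2 and 3 can be sketched as follows. This is a deliberately simple illustration, not the estimator used in the paper (which the text above does not specify): it aligns the graphs of one subgroup by sorting nodes by degree and averages the resulting adjacency matrices into a fixed-resolution step function. The function name `estimate_step_graphon` and the toy subgroup are assumptions made for the example.

```python
import numpy as np


def estimate_step_graphon(adjs, resolution):
    """Average degree-sorted adjacency matrices into a step-function graphon.

    Swapped-in simple estimator (assumption): sort nodes by degree, split them
    into `resolution` equal-width blocks, and average the entries per block pair.
    Diagonal entries (no self-loops) are included in the block averages.
    """
    acc = np.zeros((resolution, resolution))
    for adj in adjs:
        adj = np.asarray(adj, dtype=float)
        n = adj.shape[0]
        # Align graphs by ordering nodes from highest to lowest degree.
        order = np.argsort(-adj.sum(axis=1), kind="stable")
        sorted_adj = adj[np.ix_(order, order)]
        # Block index of each sorted node, in 0 .. resolution - 1.
        block = np.arange(n) * resolution // n
        rows, cols = block[:, None], block[None, :]
        sums = np.zeros((resolution, resolution))
        counts = np.zeros((resolution, resolution))
        np.add.at(sums, (rows, cols), sorted_adj)
        np.add.at(counts, (rows, cols), 1.0)
        # Blocks with no node pairs stay at zero.
        acc += np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    return acc / len(adjs)


# Toy subgroup (assumed): two copies of the complete graph on four nodes.
complete4 = np.ones((4, 4)) - np.eye(4)
graphon = estimate_step_graphon([complete4, complete4], resolution=2)
```

Each cell of the returned matrix is an estimated edge probability, matching the reading of Figure 3, where brighter cells indicate a higher probability of an edge at that location. Matrices estimated this way for two subgroups at the same resolution could then be combined by the mixture $\lambda W_i+(1-\lambda)W_j$.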