Learning to Explore: Policy-Guided Outlier Synthesis for Graph Out-of-Distribution Detection

Li Sun; Lanxu Yang; Jiayu Tian; Bowen Fang; Xiaoyan Yu; Junda Ye; Peng Tang; Hao Peng; Philip S. Yu

TL;DR

Policy-Guided Outlier Synthesis (PGOS) replaces static sampling heuristics with a learned exploration strategy and achieves state-of-the-art performance on multiple graph OOD and anomaly detection benchmarks.

Abstract

Detecting out-of-distribution (OOD) graphs is crucial for ensuring the safety and reliability of Graph Neural Networks. In unsupervised graph-level OOD detection, models are typically trained using only in-distribution (ID) data, resulting in incomplete feature space characterization and weak decision boundaries. Although synthesizing outliers offers a promising solution, existing approaches rely on fixed, non-adaptive sampling heuristics (e.g., distance- or density-based), limiting their ability to explore informative OOD regions. We propose a Policy-Guided Outlier Synthesis (PGOS) framework that replaces static heuristics with a learned exploration strategy. Specifically, PGOS trains a reinforcement learning agent to navigate low-density regions in a structured latent space and sample representations that most effectively refine the OOD decision boundary. These representations are then decoded into high-quality pseudo-OOD graphs to improve detector robustness. Extensive experiments demonstrate that PGOS achieves state-of-the-art performance on multiple graph OOD and anomaly detection benchmarks.
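
To make the idea of rewarding exploration in low-density latent regions concrete, here is a minimal toy sketch. It is not the paper's reward function, which this page does not give. The function names, the use of cluster prototypes as a density proxy, and the `inner` and `outer` thresholds are all assumptions made for illustration: a latent point earns a positive reward only when it lies in a band that is away from every prototype yet still within a bounded distance of the nearest one.

```python
import numpy as np

def nearest_prototype_distance(z, prototypes):
    """Euclidean distance from each latent point to its nearest prototype."""
    diffs = z[:, None, :] - prototypes[None, :, :]   # shape (n_points, n_protos, dim)
    return np.linalg.norm(diffs, axis=-1).min(axis=1)

def toy_reward(z, prototypes, inner=1.0, outer=3.0):
    """Hypothetical reward (an assumption, not the paper's definition).

    Points closer than `inner` to a prototype are treated as in-distribution,
    and points farther than `outer` as uninformative; both get a penalty of -1.
    Inside the band, the reward grows with distance from the nearest prototype.
    """
    d = nearest_prototype_distance(z, prototypes)
    in_band = (d >= inner) & (d <= outer)
    return np.where(in_band, d - inner, -1.0)

# Two illustrative prototypes and three candidate latent points.
prototypes = np.array([[0.0, 0.0], [10.0, 0.0]])
candidates = np.array([[0.0, 0.0], [2.0, 0.0], [5.0, 0.0]])
rewards = toy_reward(candidates, prototypes)
```

A learned policy would be trained to propose points that maximize a signal of this kind, whereas a fixed heuristic would sample the band by a predefined rule.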

Paper Structure (30 sections, 13 equations, 5 figures, 3 tables)

Introduction
Related Work
Graph-level Out-of-Distribution Detection.
Outlier Synthesis for OOD Detection.
Preliminaries and Problem Definition
Unsupervised Graph-level OOD Detection.
Reinforcement Learning.
Methodology
Prototypical Representation Learning for Graphs
Prototypical Contrastive Objective.
Generative Reconstruction Loss.
Policy-Guided Outlier Synthesis
MDP Formulation for Outlier Synthesis.
Reward Function Design.
Boundary Constraint.
...and 15 more sections

Figures (5)

Figure 1: Comparisons between policy-guided and heuristic-based sampling. (a) Policy-guided sampling via a learned agent. (b) Heuristic sampling follows predefined rules.
Figure 2: An overview of our PGOS framework based on pseudo-outlier synthesis.
Figure 3: t-SNE visualizations of graph embeddings learned by PGCL on three different datasets. Each point denotes a graph, and colors represent the $K=8$ distinct clusters.
Figure 4: t-SNE visualization of points sampled by the RL-based and Gaussian sampling strategies on the BBBP+BACE OOD dataset.
Figure 5: Sensitivity analysis of the number of prototypes $K$ on OOD detection performance across different datasets.
