GOODAT: Towards Test-time Graph Out-of-Distribution Detection

Luzhi Wang, Dongxiao He, He Zhang, Yixin Liu, Wenjie Wang, Shirui Pan, Di Jin, Tat-Seng Chua

TL;DR

This paper tackles the problem of detecting out-of-distribution (OOD) graph samples at test time. It addresses the limitations of methods that train a dedicated detector and of existing data-centric methods that still depend on training data. It introduces GOODAT, a data-centric, unsupervised, plug-and-play detector that uses a lightweight graph masker and three losses based on the Graph Information Bottleneck (GIB) to extract informative subgraphs and separate ID from OOD patterns, without modifying the GNN backbone or relying on training data. The core contributions are (i) the subgraph GIB loss, (ii) the masked graph GIB loss, and (iii) a Copula-based graph distribution separating loss, which are jointly optimized as a final loss L_g. Experiments show that GOODAT outperforms state-of-the-art methods on multiple real-world datasets and that it also applies to graph anomaly detection. Overall, GOODAT offers a practical solution that needs no training data and can be applied to any pretrained GNN, enhancing reliability in open-world graph applications with minimal computational overhead.

Abstract

Graph neural networks (GNNs) have found widespread application in modeling graph data across diverse domains. While GNNs excel in scenarios where the testing data shares the distribution of their training counterparts (in-distribution, ID), they often make incorrect predictions when confronted with samples from an unfamiliar distribution (out-of-distribution, OOD). To identify and reject OOD samples with GNNs, recent studies have explored graph OOD detection, often focusing on training a specific model or modifying the data on top of a well-trained GNN. Despite their effectiveness, these methods come with heavy training resources and costs, as they need to optimize the GNN-based models on training data. Moreover, their reliance on modifying the original GNNs and accessing training data further restricts their universality. To this end, this paper introduces a method to detect Graph Out-of-Distribution At Test-time (namely GOODAT), a data-centric, unsupervised, and plug-and-play solution that operates independently of training data and modifications of GNN architecture. With a lightweight graph masker, GOODAT can learn informative subgraphs from test samples, enabling the capture of distinct graph patterns between OOD and ID samples. To optimize the graph masker, we meticulously design three unsupervised objective functions based on the graph information bottleneck principle, motivating the masker to capture compact yet informative subgraphs for OOD detection. Comprehensive evaluations confirm that our GOODAT method outperforms state-of-the-art baselines across a variety of real-world datasets. The code is available on GitHub: https://github.com/Ee1s/GOODAT

Paper Structure (21 sections, 13 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Comparisons between GOODAT and other methods. To detect OOD samples, (a) most GNN-based methods need to learn a detector from the training data [DBLP:conf/wsdm/LiuD0P23]; (b) other data-centric methods learn an MLP to modify the training data while keeping the well-trained GNN fixed [DBLP:conf/kdd/Guo0CLSD23]; (c) in contrast, our test-time OOD detector works directly on the test data, without consulting the training data or changing the parameters of the well-trained GNN.
  • Figure 2: Overview of GOODAT. In the GOODAT training process, a graph masker $M$, consisting of two parameterized matrices, is applied to the input test graph $G$. The masker is trained with three GIB-boosted losses, which take the graph $G$ and its corresponding surrogate ID label $Y$ as inputs. The informative subgraph $Z$ and the masked graph $Z'$ are obtained with the trainable parameters $M$ (e.g., $Z = G \odot M$). During the inference phase of the target GNN, the graph masker and the GIB-boosted losses yield an OOD score for each test graph, which is used to decide whether the input graph is OOD.
  • Figure 3: The core idea of GOODAT. 'Inf.' is the abbreviation of information.
  • Figure 4: Illustrations of the three GIB-boosted losses in GOODAT.
  • Figure 5: Parameter sensitivity analysis and visualization.
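
The masking step described in the Figure 2 caption ($Z = G \odot M$) can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: that the two parameterized matrices are an adjacency mask and a node-feature mask, that each mask is a sigmoid of free logits, and that the masked graph $Z'$ is the complement $G \odot (1 - M)$. The function names `soft_mask`, `hadamard`, `complement`, and `apply_masker` are hypothetical.

```python
import math


def sigmoid(x):
    """Squash a real-valued logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_mask(logits):
    """Turn a matrix of trainable logits into a soft mask (assumed parameterization)."""
    return [[sigmoid(v) for v in row] for row in logits]


def hadamard(a, b):
    """Element-wise product of two equally shaped matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def complement(mask):
    """Return 1 - mask, element-wise (assumed definition of the masked-out part)."""
    return [[1.0 - v for v in row] for row in mask]


def apply_masker(adj, feats, adj_logits, feat_logits):
    """Split a graph (adjacency, features) into Z = G * M and Z' = G * (1 - M)."""
    m_adj = soft_mask(adj_logits)
    m_feat = soft_mask(feat_logits)
    z = (hadamard(adj, m_adj), hadamard(feats, m_feat))
    z_prime = (hadamard(adj, complement(m_adj)), hadamard(feats, complement(m_feat)))
    return z, z_prime


# Toy test graph: a 3-node path with 2-dimensional node features.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
feats = [[1.0, 2.0],
         [3.0, 4.0],
         [5.0, 6.0]]

# Zero logits give a uniform mask of 0.5 before any optimization.
adj_logits = [[0.0] * 3 for _ in range(3)]
feat_logits = [[0.0] * 2 for _ in range(3)]

(z_adj, z_feats), (zp_adj, zp_feats) = apply_masker(adj, feats, adj_logits, feat_logits)
```

Under these assumptions, $Z$ and $Z'$ always sum back to the original graph, so the masker only redistributes the graph's content between the two parts. In GOODAT the mask parameters would be optimized with the three GIB-boosted losses, which this sketch does not implement.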

Theorems & Definitions (1)

  • Definition 1: Test-time graph OOD detection