AP-OOD: Attention Pooling for Out-of-Distribution Detection
Claus Hofmann, Christian Huber, Bernhard Lehner, Daniel Klotz, Sepp Hochreiter, Werner Zellinger
TL;DR
AP-OOD replaces mean pooling with an attention-based pooling mechanism for OOD detection in natural language, so that token-level structure is preserved rather than averaged away. It combines a directional, Mahalanobis-like distance with learnable heads and optional matrix-valued queries, yielding a semi-supervised framework that can use auxiliary (AUX) outlier data when it is available. In the unsupervised setting, AP-OOD reports state-of-the-art FPR95 on XSUM and WMT15 En-Fr; in the supervised setting, its performance improves as the amount of AUX data increases. The approach generalizes across summarization, translation, and audio tasks and adapts to varying levels of auxiliary supervision.
Abstract
Out-of-distribution (OOD) detection, which maps high-dimensional data into a scalar OOD score, is critical for the reliable deployment of machine learning models. A key challenge in recent research is how to effectively leverage and aggregate token embeddings from language models to obtain the OOD score. In this work, we propose AP-OOD, a novel OOD detection method for natural language that goes beyond simple average-based aggregation by exploiting token-level information. AP-OOD is a semi-supervised approach that flexibly interpolates between unsupervised and supervised settings, enabling the use of limited auxiliary outlier data. Empirically, AP-OOD sets a new state of the art in OOD detection for text: in the unsupervised setting, it reduces the FPR95 (false positive rate at a 95% true positive rate) from 27.84% to 4.67% on XSUM summarization, and from 77.08% to 70.37% on WMT15 En-Fr translation.
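To make the FPR95 metric concrete, the following is a minimal sketch of how it could be computed from scalar OOD scores. It is not taken from the paper: the function name `fpr_at_95_tpr` is hypothetical, and the sketch assumes that higher scores mean "more OOD" and that in-distribution (ID) samples are treated as the positive class, which is a common convention but may differ from the authors' evaluation code.

```python
import numpy as np


def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID when 95% of ID samples are accepted.

    Assumptions (not from the paper): higher score = more OOD, and ID is the
    positive class.
    """
    id_sorted = np.sort(np.asarray(id_scores, dtype=float))
    ood = np.asarray(ood_scores, dtype=float)
    n = id_sorted.size
    if n == 0 or ood.size == 0:
        raise ValueError("both score arrays must be non-empty")

    # Index of the smallest ID score such that at least 95% of ID samples
    # lie at or below it: ceil(0.95 * n) - 1, computed in integer arithmetic.
    k = (95 * n + 99) // 100 - 1
    threshold = id_sorted[k]

    # An OOD sample is a false positive if its score does not exceed the
    # threshold, i.e. it would be accepted as in-distribution.
    return float(np.mean(ood <= threshold))


# Toy scores for illustration only; these are not results from the paper.
example_id = np.arange(100)          # ID scores 0, 1, ..., 99
example_ood = [90, 94, 95, 200]      # two of four fall at or below the threshold
example_fpr = fpr_at_95_tpr(example_id, example_ood)
```

Under these assumptions, a lower value is better: an FPR95 of 4.67% would mean that fewer than 5 in 100 OOD inputs pass as in-distribution at the operating point where 95% of ID inputs are accepted.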
