AP-OOD: Attention Pooling for Out-of-Distribution Detection

Claus Hofmann, Christian Huber, Bernhard Lehner, Daniel Klotz, Sepp Hochreiter, Werner Zellinger

TL;DR

AP-OOD introduces an attention-based pooling mechanism for OOD detection in natural language. It replaces mean pooling so that token-level structure is preserved. By combining a directional Mahalanobis-like distance with learnable heads and optional matrix-valued queries, it yields a semi-supervised framework that can use auxiliary outlier (AUX) data when available. In the unsupervised setting, AP-OOD reduces FPR95 from 27.84% to 4.67% on XSUM and from 77.08% to 70.37% on WMT15 En-Fr; in the supervised setting, its performance improves as the amount of AUX data increases. The approach applies across summarization, translation, and audio tasks and adapts to varying levels of auxiliary supervision.

Abstract

Out-of-distribution (OOD) detection, which maps high-dimensional data into a scalar OOD score, is critical for the reliable deployment of machine learning models. A key challenge in recent research is how to effectively leverage and aggregate token embeddings from language models to obtain the OOD score. In this work, we propose AP-OOD, a novel OOD detection method for natural language that goes beyond simple average-based aggregation by exploiting token-level information. AP-OOD is a semi-supervised approach that flexibly interpolates between unsupervised and supervised settings, enabling the use of limited auxiliary outlier data. Empirically, AP-OOD sets a new state of the art in OOD detection for text: in the unsupervised setting, it reduces the FPR95 (false positive rate at 95% true positives) from 27.84% to 4.67% on XSUM summarization, and from 77.08% to 70.37% on WMT15 En-Fr translation.
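
To make the FPR95 metric quoted above concrete, here is a minimal sketch of how it can be computed from scalar OOD scores. It assumes that a higher score means "more OOD" and that OOD samples are the positive class; some papers use the opposite convention, in which case the roles of the two lists swap. The function name `fpr_at_tpr` and the toy scores are illustrative and do not come from the paper.

```python
import math


def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """False positive rate on ID data at the threshold that flags
    at least a fraction `tpr` of the OOD samples.

    Assumption: higher score = more OOD, and OOD is the positive class.
    """
    ranked = sorted(ood_scores, reverse=True)
    # Smallest number of OOD samples that must be flagged to reach `tpr`.
    k = max(1, math.ceil(tpr * len(ranked)))
    threshold = ranked[k - 1]
    # ID samples scoring at or above the threshold are false positives.
    false_positives = sum(score >= threshold for score in id_scores)
    return false_positives / len(id_scores)


# Toy scores, purely illustrative.
toy_id_scores = [0.5, 1.5, 2.5, 0.1]
toy_ood_scores = [float(s) for s in range(1, 21)]
print(fpr_at_tpr(toy_id_scores, toy_ood_scores))
```

A lower value is better: it is the share of in-distribution inputs that would be rejected when the detector is tuned to catch 95% of the OOD inputs.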

Paper Structure

This paper contains 53 sections, 43 equations, 9 figures, 11 tables, and 4 algorithms.

Figures (9)

  • Figure 1: Illustrative example for the failure of mean pooling. (Left) ID and OOD sequences $\bm{Z}_i \in \mathbb{R}^{2 \times 2}$, where each sequence contains a pair of token embeddings with two features each. Token embeddings that belong to the same sequence are connected with lines. (Center) The means of the ID and OOD sequences both cluster around the origin. (Right) A mean pooling approach cannot discriminate between the ID and OOD sequences.
  • Figure 3: OOD detection performance on the input token embeddings of PEGASUS$_{\text{LARGE}}$ trained on XSUM. We vary the number of AUX samples and compare AP-OOD, binary logits (Ren et al., 2022), Deep SAD (Ruff et al., 2019), and relative Mahalanobis (Ren et al., 2022). AP-OOD attains the highest AUROC regardless of the number of AUX samples.
  • Figure 4: OOD detection performance on the input token embeddings of PEGASUS$_{\text{LARGE}}$ trained on XSUM when scaling to large $M$ and $T$. We vary $M \in \{1, 16, 128, 1024\}$, $T \in \{1, 4, 16\}$, and $\beta \in \{1 / \sqrt{D}, 0.25, 0.5, 1, 2\}$. (Left) Mean AUROC in % for the best $\beta$ at each $(M, T)$ combination. (Right) The best $\beta$ selected at each $(M, T)$ combination.
  • Figure 5: OOD detection performance on text summarization for all OOD data sets. We vary the number of AUX examples and compare results from AP-OOD, binary logits (Ren et al., 2022), relative Mahalanobis (Ren et al., 2022), and Deep SAD (Ruff et al., 2019).
  • Figure 6: AP-OOD's attention weights on randomly selected output sequences from OOD data sets on text summarization. For each sequence, we visualize the heads $j$ whose $d_j(\bm{Z})$, taken before squaring, deviates most in the positive and in the negative direction.
  • ...and 4 more figures
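
The failure mode described in the Figure 1 caption can be reproduced with a small sketch. The two toy sequences below are hypothetical and chosen by hand so that both have a mean at the origin. The attention pooling shown is a generic softmax-weighted sum with a fixed, hand-picked query vector and an inverse temperature `beta`; it is meant only to illustrate why token-level weighting can separate what mean pooling cannot, and it is not the paper's scoring function, whose queries are learned.

```python
import math


def mean_pool(seq):
    """Average the token embeddings of one sequence."""
    n_tokens = len(seq)
    n_features = len(seq[0])
    return [sum(tok[d] for tok in seq) / n_tokens for d in range(n_features)]


def attention_pool(seq, query, beta=1.0):
    """Softmax-weighted sum of token embeddings.

    Weights are softmax(beta * <token, query>) over the tokens.
    `query` is a hand-picked vector here (an assumption for illustration).
    """
    logits = [beta * sum(t * q for t, q in zip(tok, query)) for tok in seq]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    n_features = len(seq[0])
    return [sum(w * tok[d] for w, tok in zip(weights, seq))
            for d in range(n_features)]


# Hypothetical 2-token, 2-feature sequences (Z in R^{2x2}).
id_seq = [[1.0, 1.0], [-1.0, -1.0]]    # tokens lie along the diagonal
ood_seq = [[1.0, -1.0], [-1.0, 1.0]]   # tokens lie along the anti-diagonal
query = [1.0, 1.0]

print("mean pooling:     ", mean_pool(id_seq), mean_pool(ood_seq))
print("attention pooling:", attention_pool(id_seq, query),
      attention_pool(ood_seq, query))
```

Both sequences average to the origin, so any score computed from the mean treats them identically. With the chosen query, the attention weights favor the token aligned with the query in the first sequence, while the second sequence receives uniform weights, so the pooled vectors differ.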

Theorems & Definitions (1)

  • proof