Out-of-Distribution Detection and Selective Generation for Conditional Language Models

Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji Lakshminarayanan, Peter J. Liu

TL;DR

The paper addresses the vulnerability of conditional language models (CLMs) to out-of-distribution (OOD) inputs by introducing lightweight OOD scores computed from the model's input and output embeddings. It shows that perplexity is unreliable for OOD detection in CLMs, whereas Gaussian-based Mahalanobis distance (MD) and relative Mahalanobis distance (RMD) scores on those embeddings separate in-domain from OOD data well for summarization and translation. Beyond detection, the authors combine these OOD scores with perplexity to enable selective generation under distribution shift, improving the quality-abstention trade-off as measured by human ratings, ROUGE, and BLEURT. The findings offer a practical path to safer deployment of generative LMs under domain shift, with applicability to sequence-to-sequence tasks and potentially decoder-only models.
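
To make the embedding-based score concrete, here is a minimal sketch of a relative Mahalanobis distance (RMD) computed with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names (`fit_gaussian`, `mahalanobis`, `rmd_score`), the synthetic 8-dimensional "embeddings", the choice of background data, and the small ridge term are all hypothetical. The idea is to fit one Gaussian to in-domain embeddings and one to a broad background set, then score an input by the difference of its two squared Mahalanobis distances.

```python
import numpy as np

def fit_gaussian(embeddings):
    """Return the mean and inverse covariance of embeddings with shape (n, d)."""
    mean = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False)
    # Small ridge term (an assumption) keeps the covariance invertible.
    cov = cov + 1e-6 * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)

def mahalanobis(x, mean, cov_inv):
    """Squared Mahalanobis distance of each row of x to the fitted Gaussian."""
    diff = x - mean
    return np.einsum("nd,de,ne->n", diff, cov_inv, diff)

def rmd_score(x, in_domain_fit, background_fit):
    """Relative MD: distance to the in-domain fit minus distance to the background fit."""
    return mahalanobis(x, *in_domain_fit) - mahalanobis(x, *background_fit)

# Synthetic stand-ins for model embeddings (hypothetical data).
rng = np.random.default_rng(0)
in_domain_train = rng.normal(0.0, 1.0, size=(2000, 8))
background_train = rng.normal(0.0, 3.0, size=(2000, 8))  # broad, general-domain proxy

in_fit = fit_gaussian(in_domain_train)
bg_fit = fit_gaussian(background_train)

in_domain_test = rng.normal(0.0, 1.0, size=(500, 8))
ood_test = rng.normal(4.0, 1.0, size=(500, 8))

rmd_in = rmd_score(in_domain_test, in_fit, bg_fit)
rmd_ood = rmd_score(ood_test, in_fit, bg_fit)
print("mean RMD in-domain:", rmd_in.mean())
print("mean RMD OOD:", rmd_ood.mean())
```

A higher score means the input looks less like the in-domain data relative to the background. In this toy setup the shifted test set receives a larger mean score than the in-domain test set.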

Abstract

Machine learning algorithms typically assume independent and identically distributed samples in training and at test time. Much work has shown that high-performing ML classifiers can degrade significantly and provide overly-confident, wrong classification predictions, particularly for out-of-distribution (OOD) inputs. Conditional language models (CLMs) are predominantly trained to classify the next token in an output sequence, and may suffer even worse degradation on OOD inputs as the prediction is done auto-regressively over many steps. Furthermore, the space of potential low-quality outputs is larger as arbitrary text can be generated and it is important to know when to trust the generated output. We present a highly accurate and lightweight OOD detection method for CLMs, and demonstrate its effectiveness on abstractive summarization and translation. We also show how our method can be used under the common and realistic setting of distribution shift for selective generation (analogous to selective prediction for classification) of high-quality outputs, while automatically abstaining from low-quality ones, enabling safer deployment of generative language models.

Paper Structure (29 sections, 8 equations, 17 figures, 14 tables, 2 algorithms)

Figures (17)

  • Figure 1: Density of perplexity scores for a CLM trained on (a) XSum for summarization and (b) WMT for translation, evaluated on other datasets/domains. Perplexity is poorly suited to OOD detection because in-domain and OOD scores overlap significantly.
  • Figure 2: The proposed OOD detector based on input and output embeddings.
  • Figure 3: Density of RMD (left) and Binary logits (right) OOD scores evaluated on summarization datasets. RMD is better at distinguishing near-OOD from far-OOD.
  • Figure 4: The Kendall's $\tau$ correlation between perplexity and (a) ROUGE-1, (b) human evaluation median rating, and (c) BLEURT decreases as the OOD score increases. Note that we use the output RMD OOD score for summarization and the input RMD OOD score for translation.
  • Figure 5: (a) The Quality (human eval) vs Abstention curve for summarization. Combined scores have the highest quality at almost all abstention rates. (b) Survival count of each dataset as a function of abstention rate, using $\text{PR}_{\text{sum}}$ (we use output/input RMD for summarization/translation to pair with perplexity). OOD data is abstained earlier than in-domain. (c, d) The same as (a, b) for translation.
  • ...and 12 more figures
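
The quality vs. abstention curves in Figure 5 can be illustrated with a short sketch. It assumes that $\text{PR}_{\text{sum}}$ denotes the sum of the percentile ranks of perplexity and of the OOD score, which is a reading of the caption rather than a definition given here. The function names and the six toy examples are hypothetical, and the quality values are made up for illustration only.

```python
import numpy as np

def percentile_rank(scores):
    """Map each score to its rank in [0, 1] within the batch (ties broken by position)."""
    ranks = np.argsort(np.argsort(scores, kind="stable"), kind="stable")
    return ranks / max(len(scores) - 1, 1)

def pr_sum(perplexity, ood_score):
    """Assumed combined score: sum of the two percentile ranks (higher = less trusted)."""
    return percentile_rank(perplexity) + percentile_rank(ood_score)

def quality_vs_abstention(combined, quality, rates):
    """Mean quality of the kept examples after abstaining on the highest-scoring fraction."""
    order = np.argsort(combined, kind="stable")  # most trusted examples first
    curve = []
    for rate in rates:
        keep = max(int(round(len(order) * (1.0 - rate))), 1)
        curve.append(quality[order[:keep]].mean())
    return np.array(curve)

# Toy values (hypothetical): one entry per generated output.
perplexity = np.array([1.2, 3.5, 2.0, 8.0, 1.5, 6.0])
ood_score = np.array([0.1, 2.0, 0.3, 5.0, 0.2, 0.5])
quality = np.array([0.9, 0.4, 0.8, 0.1, 0.85, 0.5])

combined = pr_sum(perplexity, ood_score)
rates = [0.0, 0.5]
curve = quality_vs_abstention(combined, quality, rates)
for rate, q in zip(rates, curve):
    print(f"abstention rate {rate:.1f}: mean quality {q:.3f}")
```

Using ranks rather than raw values puts perplexity and the OOD score on a common scale before they are added, so neither signal dominates because of its units.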