Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection

Linas Nasvytis; Kai Sandbrink; Jakob Foerster; Tim Franzmeyer; Christian Schroeder de Witt

TL;DR

This work addresses out-of-distribution (OOD) detection in reinforcement learning by first clarifying terminology and then introducing benchmarks with temporally correlated anomalies generated by autoregressive noise. The authors propose DEXTER, a time-series feature extractor paired with an isolation-forest ensemble, and extend it with DEXTER+C, a sequential detector based on CUSUM, to make OOD decisions online. The new ARTS, ARNO, and ARNS benchmarks reveal significant performance gaps in existing detectors, with DEXTER (and especially DEXTER+C) delivering higher AUROC and substantially faster detection in many scenarios. The results support combining time-series-based detection with sequential hypothesis testing to deploy RL systems safely in environments with temporally coherent disturbances, and point toward future work on real-world validation, cross-dimensional noise handling, and adaptive windowing.
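The sequential step mentioned above can be illustrated with a textbook one-sided CUSUM rule. This is a minimal sketch, not the DEXTER+C procedure itself, whose details are not given on this page. The function name `cusum_detect`, the stream of per-step anomaly scores, and the `drift` and `threshold` values are all assumptions made for illustration.

```python
def cusum_detect(scores, reference_mean, drift=0.5, threshold=5.0):
    """One-sided CUSUM over a stream of anomaly scores.

    Returns the first index at which the cumulative statistic exceeds
    `threshold`, or None if no alarm is raised.
    All parameter values here are illustrative assumptions.
    """
    stat = 0.0
    for t, score in enumerate(scores):
        # Accumulate evidence of an upward shift; reset at zero otherwise.
        stat = max(0.0, stat + (score - reference_mean - drift))
        if stat > threshold:
            return t
    return None


# Hypothetical score stream: in-distribution until step 20, shifted afterwards.
stream = [0.0] * 20 + [2.0] * 20
alarm_step = cusum_detect(stream, reference_mean=0.0)  # returns 23
```

The statistic grows only while scores exceed the reference mean by more than `drift`, so isolated high scores decay back to zero while a sustained shift accumulates until the alarm fires.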

Abstract

While reinforcement learning (RL) algorithms have been successfully applied across numerous sequential decision-making problems, their generalization to unforeseen testing environments remains a significant concern. In this paper, we study the problem of out-of-distribution (OOD) detection in RL, which focuses on identifying situations at test time that RL agents have not encountered in their training environments. We first propose a clarification of terminology for OOD detection in RL, which aligns it with the literature from other machine learning domains. We then present new benchmark scenarios for OOD detection, which introduce anomalies with temporal autocorrelation into different components of the agent-environment loop. We argue that such scenarios have been understudied in the current literature, despite their relevance to real-world situations. Confirming our theoretical predictions, our experimental results suggest that state-of-the-art OOD detectors are not able to identify such anomalies. To address this problem, we propose a novel method for OOD detection, which we call DEXTER (Detection via Extraction of Time Series Representations). By treating environment observations as time series data, DEXTER extracts salient time series features, and then leverages an ensemble of isolation forest algorithms to detect anomalies. We find that DEXTER can reliably identify anomalies across benchmark scenarios, exhibiting superior performance compared to both state-of-the-art OOD detectors and high-dimensional changepoint detectors adopted from statistics.
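The idea of extracting time series features from observation windows and scoring them with an isolation forest can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a single small isolation forest written from scratch rather than an ensemble, a one-dimensional observation stream, and a hypothetical feature set (mean, standard deviation, lag-1 autocorrelation). The window length, AR coefficient, and all function names are likewise assumptions.

```python
import math
import random


def ar_noise(n, coeffs, rng):
    """AR(p) noise: e_t = sum_k coeffs[k] * e_{t-1-k} + w_t, with w_t ~ N(0, 1)."""
    series = []
    for t in range(n):
        value = rng.gauss(0.0, 1.0)
        for k, phi in enumerate(coeffs):
            if t - 1 - k >= 0:
                value += phi * series[t - 1 - k]
        series.append(value)
    return series


def window_features(x):
    """Hypothetical feature set: mean, standard deviation, lag-1 autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    lag1 = sum((x[i] - mean) * (x[i - 1] - mean) for i in range(1, n)) / (n * var)
    return [mean, math.sqrt(var), lag1]


def avg_path(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split on a random feature at a random value; leaves store a count."""
    if depth >= max_depth or len(rows) <= 1:
        return len(rows)
    dim = rng.randrange(len(rows[0]))
    lo, hi = min(r[dim] for r in rows), max(r[dim] for r in rows)
    if lo == hi:
        return len(rows)
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return (dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, x):
    """Depth at which x reaches a leaf, plus the leaf-size correction term."""
    depth = 0
    while isinstance(tree, tuple):
        dim, split, left, right = tree
        tree = left if x[dim] < split else right
        depth += 1
    return depth + avg_path(tree)


def fit_forest(rows, n_trees=50, sample_size=64, seed=0):
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(sample_size))
    return [build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
            for _ in range(n_trees)]


def anomaly_score(forest, x, sample_size=64):
    """Score in (0, 1); shorter average paths give scores closer to 1."""
    mean_path = sum(path_length(t, x) for t in forest) / len(forest)
    return 2.0 ** (-mean_path / avg_path(sample_size))


# Train on windows of uncorrelated noise, then score fresh windows of each kind.
rng = random.Random(1)
train = [window_features(ar_noise(50, [], rng)) for _ in range(200)]
forest = fit_forest(train)
in_dist = [window_features(ar_noise(50, [], rng)) for _ in range(20)]
ood = [window_features(ar_noise(50, [0.9], rng)) for _ in range(20)]
mean_in = sum(anomaly_score(forest, f) for f in in_dist) / len(in_dist)
mean_ood = sum(anomaly_score(forest, f) for f in ood) / len(ood)
```

In this toy setting the autocorrelated windows fall outside the range of features seen in training, so they tend to be isolated after fewer splits and receive higher average scores than fresh uncorrelated windows.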

Paper Structure (40 sections, 1 equation, 2 figures, 4 tables, 5 algorithms)

Introduction
Related Work
Algorithms for OOD Detection
Limitations of current OOD detection approaches
Sequential Hypothesis Testing
Background and Notation
Terminology of OOD Detection in Reinforcement Learning
Novel Testing Scenarios for OOD Detection
Autocorrelated noise patterns
ARTS: Autoregressive Time Series environments
ARNO: Autoregressive Noised Observation environments
ARNS: Autoregressive Noised State environments
DEXTER: Detection via Extraction of Time Series Representations
Feature extractor $f$
Isolation Forest Algorithm $h$
...and 25 more sections

Figures (2)

Figure 1: Illustration of temporally autocorrelated anomalies. Left: at injection time (t = 48), noise applied to the observation changes from no correlation to 1-step autocorrelation. Right: at injection time (t = 56), noise changes from no correlation to 2-step autocorrelation.
Figure 2: Illustrations of Autoregressive (AR) model with parameters for no correlation (top), 1-step autocorrelation (bottom-left), and 2-step autocorrelation (bottom-right), which is used to create three different types of noise in new testing scenarios.
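
The three noise types named in the captions can be sketched with a small autoregressive generator. This is a minimal illustration under assumptions: the coefficient values (0.8), the unit-variance Gaussian innovations, and the function names are not taken from the paper. Only the injection time t = 48 mirrors the left panel of Figure 1.

```python
import random


def ar_noise(n, coeffs, sigma=1.0, seed=0):
    """Generate n samples of AR(p) noise: e_t = sum_k coeffs[k] * e_{t-1-k} + w_t.

    An empty `coeffs` list gives uncorrelated (white) noise.
    """
    rng = random.Random(seed)
    series = []
    for t in range(n):
        value = rng.gauss(0.0, sigma)
        for k, phi in enumerate(coeffs):
            if t - 1 - k >= 0:
                value += phi * series[t - 1 - k]
        series.append(value)
    return series


def noise_with_injection(n, t_inject, coeffs, seed=0):
    """Uncorrelated noise before t_inject, AR(p) noise from t_inject onward."""
    before = ar_noise(t_inject, [], seed=seed)
    after = ar_noise(n - t_inject, coeffs, seed=seed + 1)
    return before + after


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i - lag] - mean) for i in range(lag, n))
    return cov / var


white = ar_noise(5000, [])           # no correlation
ar1 = ar_noise(5000, [0.8])          # 1-step autocorrelation (assumed coefficient)
ar2 = ar_noise(5000, [0.0, 0.8])     # 2-step autocorrelation (assumed coefficient)
episode = noise_with_injection(100, 48, [0.8])  # switch at t = 48
```

All three series share the same innovation distribution and differ only in how strongly past values feed into the current one, which is what makes such anomalies hard to spot from any single time step.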

Theorems & Definitions (3)

Definition 1
Definition 2
Definition 3
