Runtime Safety Monitoring of Deep Neural Networks for Perception: A Survey

Albert Schotschneider, Svetlana Pavlitska, J. Marius Zöllner

TL;DR

This survey addresses runtime safety for DNN-based perception systems, focusing on external monitors that run at inference time and do not modify the monitored model. It organizes existing methods into a taxonomy (Monitoring Inputs, Monitoring Internal Representations, Monitoring Outputs, and Combined Approaches) and maps each category to the DNN safety concerns it addresses: generalization errors, OOD inputs, and adversarial attacks. The paper synthesizes a wide range of methods, datasets, and architectures, discusses their strengths and limitations, and outlines open challenges and future directions toward integrated, efficient, and explainable monitoring. The findings offer practical guidance for deploying robust perception systems in real time and point to research directions for safer DNN deployment in safety-critical settings.

Abstract

Deep neural networks (DNNs) are widely used in perception systems for safety-critical applications, such as autonomous driving and robotics. However, DNNs remain vulnerable to various safety concerns, including generalization errors, out-of-distribution (OOD) inputs, and adversarial attacks, which can lead to hazardous failures. This survey provides a comprehensive overview of runtime safety monitoring approaches, which operate in parallel to DNNs during inference to detect these safety concerns without modifying the DNN itself. We categorize existing methods into three main groups: monitoring inputs, monitoring internal representations, and monitoring outputs. We analyze the state of the art for each category, identify strengths and limitations, and map methods to the safety concerns they address. In addition, we highlight open challenges and future research directions.

Paper Structure

This paper contains 12 sections, 1 figure, 2 tables.

Figures (1)

  • Figure 1: A system architecture for monitoring a DNN using a runtime safety monitor, which can observe inputs $\boldsymbol{x}$, internal layers $\boldsymbol{z}$, or outputs $\boldsymbol{y}$ of the monitored DNN to detect safety concerns and trigger a safety alert. The neural network component $\Phi$ transforms the input $\boldsymbol{x}$ into a latent representation $\boldsymbol{z}$, while component $\Psi$ of the system may handle diverse tasks, e.g., classification, segmentation, or decoding.
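
The architecture in Figure 1 can be illustrated with a minimal sketch. It is not the survey's implementation: the class `RuntimeMonitor`, the toy components `phi` and `psi`, the three check functions, and all thresholds are hypothetical placeholders chosen only to show how an external monitor can observe $\boldsymbol{x}$, $\boldsymbol{z}$, or $\boldsymbol{y}$ and raise a safety alert without modifying the monitored network. For simplicity the checks run sequentially here rather than in parallel to the DNN.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional

Vector = List[float]
Check = Callable[[Vector], bool]  # returns True when a safety concern is suspected


@dataclass
class MonitoredOutput:
    y: Vector            # output of Psi, passed through unchanged
    alert: bool          # True if at least one check fired
    reasons: List[str]   # which observation points fired ("input", "latent", "output")


class RuntimeMonitor:
    """Wraps Phi and Psi and observes x, z, and y without changing either component."""

    def __init__(self, phi, psi,
                 input_check: Optional[Check] = None,
                 latent_check: Optional[Check] = None,
                 output_check: Optional[Check] = None):
        self.phi, self.psi = phi, psi
        self.checks = [("input", input_check),
                       ("latent", latent_check),
                       ("output", output_check)]

    def __call__(self, x: Vector) -> MonitoredOutput:
        z = self.phi(x)   # latent representation
        y = self.psi(z)   # task output
        signals = {"input": x, "latent": z, "output": y}
        reasons = [name for name, check in self.checks
                   if check is not None and check(signals[name])]
        return MonitoredOutput(y=y, alert=bool(reasons), reasons=reasons)


# --- Toy stand-ins for the monitored DNN (assumptions, not from the paper) ---
def phi(x: Vector) -> Vector:
    return [2.0 * v for v in x]


def psi(z: Vector) -> Vector:
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]  # softmax over class scores


# --- Toy checks with arbitrary thresholds (assumptions) ---
def input_out_of_range(x: Vector) -> bool:
    return any(v < 0.0 or v > 1.0 for v in x)


def latent_norm_too_large(z: Vector) -> bool:
    return math.sqrt(sum(v * v for v in z)) > 5.0


def low_confidence(y: Vector) -> bool:
    return max(y) < 0.6


monitor = RuntimeMonitor(phi, psi,
                         input_check=input_out_of_range,
                         latent_check=latent_norm_too_large,
                         output_check=low_confidence)

if __name__ == "__main__":
    for x in ([0.1, 0.9], [0.5, 0.5], [3.0, 0.0]):
        result = monitor(x)
        print(x, result.alert, result.reasons)
```

Keeping the checks as plain callables reflects the survey's framing of the monitor as an external component: any of the three observation points can be enabled or swapped without retraining or altering $\Phi$ or $\Psi$.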