Modular Safety Guardrails Are Necessary for Foundation-Model-Enabled Robots in the Real World

Joonkyung Kim, Wenxi Chen, Davood Soleymanzadeh, Yi Ding, Xiangbo Gao, Zhengzhong Tu, Ruqi Zhang, Fan Fei, Sushant Veer, Yiwei Lyu, Minghui Zheng, Yan Gu

TL;DR

Foundation models enable broad capabilities for robots but introduce action, decision, and human-centered safety challenges in open environments. The authors propose a two-layer modular safety guardrail with a Monitoring and Evaluation Layer and an Intervention Layer (comprising a planning-level Decision Gate and an execution-level Action Gate) to enforce safety independently of FM outputs. They discuss representation alignment and conservatism allocation as co-design opportunities and provide deployment examples to illustrate cross-layer safety enforcement. While not a universal safety solution, modular guardrails offer a practical, auditable foundation to improve safety, verifiability, and deployment of FM-enabled robots in unstructured settings.

Abstract

The integration of foundation models (FMs) into robotics has accelerated real-world deployment, while introducing new safety challenges arising from open-ended semantic reasoning and embodied physical action. These challenges require safety notions beyond physical constraint satisfaction. In this paper, we characterize FM-enabled robot safety along three dimensions: action safety (physical feasibility and constraint compliance), decision safety (semantic and contextual appropriateness), and human-centered safety (conformance to human intent, norms, and expectations). We argue that existing approaches, including static verification, monolithic controllers, and end-to-end learned policies, are insufficient in settings where tasks, environments, and human expectations are open-ended, long-tailed, and subject to adaptation over time. To address this gap, we propose modular safety guardrails, consisting of monitoring (evaluation) and intervention layers, as an architectural foundation for comprehensive safety across the autonomy stack. Beyond modularity, we highlight possible cross-layer co-design opportunities through representation alignment and conservatism allocation to enable faster, less conservative, and more effective safety enforcement. We call on the community to explore richer guardrail modules and principled co-design strategies to advance safe real-world physical AI deployment.

Paper Structure (20 sections, 2 figures)

Figures (2)

  • Figure 1: Overview of the safety definitions (Sec. \ref{sec: safety_def}), source of safety challenges (Sec. \ref{sec: fm_safety_challenges}), and existing alternative methods (Sec. \ref{sec: sota_safety_limitations}).
  • Figure 2: Overview of one potential modular safety guardrail architecture. It shows the architecture and information flow between the FM-enabled robotic stack (top) and the modular safety guardrail (bottom). Arrows 1–3 denote information flow from perception, planning, and control to the Monitoring and Evaluation Layer, which generates risk signals for downstream modules. Risk indicators from planning and control are sent to the Intervention Layer (Arrow 4), consisting of a Decision Gate that screens plans and triggers replanning upon rejection (Arrow 5) and an Action Gate that enforces physical safety constraints on control commands. The Action Gate may apply last-resort safety filters to ensure only physically safe actions are executed (Arrow 6).
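
The information flow described for Figure 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name here (`RiskSignals`, `monitor_and_evaluate`, `decision_gate`, `action_gate`, `fm_planner`), the toy forbidden-step rule, the linear speed bound, and the thresholds are assumptions made purely for illustration. The sketch assumes a 1-D robot whose only control command is a speed, and a planner that can be queried again after a rejection.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RiskSignals:
    plan_risk: float       # semantic/contextual risk of the plan, in [0, 1]
    max_safe_speed: float  # speed bound in m/s derived from perception


def monitor_and_evaluate(human_distance_m: float, plan: List[str]) -> RiskSignals:
    """Monitoring and Evaluation Layer: map observations to risk signals."""
    forbidden = {"hand knife to child"}  # hypothetical rule set, not from the paper
    plan_risk = 1.0 if any(step in forbidden for step in plan) else 0.0
    # Toy rule: the allowed speed shrinks linearly with human proximity, capped at 1 m/s.
    max_safe_speed = max(0.0, min(1.0, 0.5 * human_distance_m))
    return RiskSignals(plan_risk, max_safe_speed)


def decision_gate(
    propose: Callable[[int], List[str]],
    human_distance_m: float,
    threshold: float = 0.5,
    max_attempts: int = 3,
) -> Tuple[List[str], RiskSignals]:
    """Planning-level gate: reject risky plans and ask the planner to replan."""
    for attempt in range(max_attempts):
        plan = propose(attempt)
        risk = monitor_and_evaluate(human_distance_m, plan)
        if risk.plan_risk <= threshold:
            return plan, risk
    # No acceptable plan was found: fall back to an empty (do-nothing) plan.
    return [], monitor_and_evaluate(human_distance_m, [])


def action_gate(commanded_speed: float, risk: RiskSignals) -> float:
    """Execution-level gate: clip the command to the physically safe range."""
    bound = risk.max_safe_speed
    return max(-bound, min(bound, commanded_speed))


def fm_planner(attempt: int) -> List[str]:
    """Stand-in for an FM planner: the first proposal is unsafe, later ones are not."""
    return ["hand knife to child"] if attempt == 0 else ["place knife in drawer"]


if __name__ == "__main__":
    plan, risk = decision_gate(fm_planner, human_distance_m=0.8)
    print(plan, action_gate(1.0, risk))
```

The design point the sketch tries to capture is that both gates act on risk signals computed outside the planner, so a plan or command is screened regardless of what the foundation model outputs. The Decision Gate handles semantic rejection and replanning, while the Action Gate is a last-resort filter on the final command.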