Modular Safety Guardrails Are Necessary for Foundation-Model-Enabled Robots in the Real World
Joonkyung Kim, Wenxi Chen, Davood Soleymanzadeh, Yi Ding, Xiangbo Gao, Zhengzhong Tu, Ruqi Zhang, Fan Fei, Sushant Veer, Yiwei Lyu, Minghui Zheng, Yan Gu
TL;DR
Foundation models (FMs) give robots broad capabilities but introduce action, decision, and human-centered safety challenges in open environments. The authors propose a two-layer modular safety guardrail, consisting of a Monitoring and Evaluation Layer and an Intervention Layer (the latter comprising a planning-level Decision Gate and an execution-level Action Gate), to enforce safety independently of FM outputs. They discuss representation alignment and conservatism allocation as co-design opportunities and provide deployment examples to illustrate cross-layer safety enforcement. While not a universal safety solution, modular guardrails offer a practical, auditable foundation for improving the safety, verifiability, and deployability of FM-enabled robots in unstructured settings.
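To make the two-layer structure concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy setting in which the FM proposes a text plan step and a scalar speed command. All names (Verdict, evaluate_plan, evaluate_action, decision_gate, action_gate, guarded_step) and both safety rules are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Verdict:
    """Result produced by the (hypothetical) Monitoring and Evaluation Layer."""
    safe: bool
    reason: str = ""


def evaluate_plan(plan_step: str, context: dict) -> Verdict:
    # Planning-level monitor. Toy rule (assumption): no knives near a child.
    if context.get("child_present") and "knife" in plan_step:
        return Verdict(False, "sharp object near a child")
    return Verdict(True)


def evaluate_action(speed: float, context: dict) -> Verdict:
    # Execution-level monitor. Toy rule (assumption): a scalar speed limit.
    limit = context.get("speed_limit", 1.0)
    if abs(speed) > limit:
        return Verdict(False, f"speed {speed} exceeds limit {limit}")
    return Verdict(True)


def decision_gate(plan_step: str, context: dict) -> Optional[str]:
    # Intervention at the planning level: reject an unsafe plan step.
    return plan_step if evaluate_plan(plan_step, context).safe else None


def action_gate(speed: float, context: dict) -> float:
    # Intervention at the execution level: clamp an unsafe command.
    if evaluate_action(speed, context).safe:
        return speed
    limit = context.get("speed_limit", 1.0)
    return max(-limit, min(limit, speed))


def guarded_step(plan_step: str, speed: float, context: dict) -> Tuple[Optional[str], float]:
    # The gates run regardless of how the FM produced its proposal.
    approved = decision_gate(plan_step, context)
    if approved is None:
        return None, 0.0  # rejected plan: issue no motion
    return approved, action_gate(speed, context)


if __name__ == "__main__":
    print(guarded_step("pick up knife", 2.0, {"child_present": True}))
    print(guarded_step("pick up cup", 2.0, {"speed_limit": 0.5}))
```

In this sketch the monitors only evaluate and the gates only intervene, so either side can be replaced or audited on its own. That separation is the property the modular design is meant to provide. A real system would replace the toy rules with learned or formally verified checks.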
Abstract
The integration of foundation models (FMs) into robotics has accelerated real-world deployment, while introducing new safety challenges arising from open-ended semantic reasoning and embodied physical action. These challenges require safety notions beyond physical constraint satisfaction. In this paper, we characterize FM-enabled robot safety along three dimensions: action safety (physical feasibility and constraint compliance), decision safety (semantic and contextual appropriateness), and human-centered safety (conformance to human intent, norms, and expectations). We argue that existing approaches, including static verification, monolithic controllers, and end-to-end learned policies, are insufficient in settings where tasks, environments, and human expectations are open-ended, long-tailed, and subject to adaptation over time. To address this gap, we propose modular safety guardrails, consisting of monitoring (evaluation) and intervention layers, as an architectural foundation for comprehensive safety across the autonomy stack. Beyond modularity, we highlight possible cross-layer co-design opportunities through representation alignment and conservatism allocation to enable faster, less conservative, and more effective safety enforcement. We call on the community to explore richer guardrail modules and principled co-design strategies to advance safe real-world physical AI deployment.
