Towards Safe Robot Foundation Models
Maximilian Tölle, Theo Gruner, Daniel Palenicek, Jonas Günster, Puze Liu, Joe Watson, Davide Tateo, Jan Peters
TL;DR
Robot foundation models often lack formal safety guarantees and can fail under distribution shift. The paper introduces a final safety layer based on ATACOM that constructs a safe action space and maps actions into the tangent space of a safety constraint manifold for a control-affine system, yielding $\mathbf{a}_{\text{safe}} = \psi_{G,f}(\mathbf{a}, \mathbf{s}, g)$. The safety module sits on top of a pre-trained vision-language-action policy and requires only predefined differentiable safety constraints $g$, so no additional safety fine-tuning is needed. In an air hockey task, the module keeps all trajectories within the constraints and success rates improve with continued training, while a baseline without the safety layer fails under the same conditions. This highlights the practical value of the approach for deploying generalist policies in safety-critical settings.
Abstract
Robot foundation models hold the potential for deployment across diverse environments, from industrial applications to household tasks. While current research focuses primarily on the policies' generalization capabilities across a variety of tasks, it fails to address safety, a critical requirement for deployment on real-world systems. In this paper, we introduce a safety layer designed to constrain the action space of any generalist policy appropriately. Our approach uses ATACOM, a safe reinforcement learning algorithm that creates a safe action space and, therefore, ensures safe state transitions. By extending ATACOM to generalist policies, our method facilitates their deployment in safety-critical scenarios without requiring any specific safety fine-tuning. We demonstrate the effectiveness of this safety layer in an air hockey environment, where it prevents a puck-hitting agent from colliding with its surroundings, a failure observed in generalist policies.
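To make the idea of a safe action space more concrete, the following is a minimal sketch of a tangent-space safety layer in the spirit of ATACOM. It is not the authors' implementation and rests on several assumptions: a velocity-controlled 2D point stands in for the robot, the workspace bounds and the functions `g`, `jac_g`, `init_slack`, `safe_velocity`, and `rollout` are hypothetical, inequality constraints $g(\mathbf{q}) \le 0$ are turned into equalities with quadratic slack variables, and an orthogonal projector onto the tangent space is used in place of an explicit tangent-space basis.

```python
import numpy as np

# Hypothetical workspace: the mallet position q = (x, y) must stay in [-1, 1]^2.
def g(q):
    """Inequality constraints g(q) <= 0, one entry per wall."""
    return np.array([q[0] - 1.0, -q[0] - 1.0, q[1] - 1.0, -q[1] - 1.0])

def jac_g(q):
    """Jacobian of g with respect to q (constant, since g is linear here)."""
    return np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])

def init_slack(q):
    """Slack values that place (q, mu) on the manifold g(q) + 0.5 * mu**2 = 0."""
    return np.sqrt(np.maximum(-2.0 * g(q), 0.0))

def safe_velocity(a, q, mu, gain=10.0):
    """Map a raw velocity action a to augmented velocities (q_dot, mu_dot)."""
    c = g(q) + 0.5 * mu**2                    # constraint residual, 0 on the manifold
    J = np.hstack([jac_g(q), np.diag(mu)])    # Jacobian of c w.r.t. (q, mu)
    J_pinv = np.linalg.pinv(J)
    P = np.eye(J.shape[1]) - J_pinv @ J       # orthogonal projector onto the tangent space
    raw = np.concatenate([a, np.zeros_like(mu)])
    v = P @ raw - gain * J_pinv @ c           # tangent motion plus drift correction
    return v[: q.size], v[q.size:]

def rollout(a, q0, steps=1000, dt=0.01):
    """Apply the same raw action at every step with explicit Euler integration."""
    q = np.asarray(q0, dtype=float)
    mu = init_slack(q)
    traj = [q.copy()]
    for _ in range(steps):
        q_dot, mu_dot = safe_velocity(a, q, mu)
        q = q + dt * q_dot
        mu = mu + dt * mu_dot
        traj.append(q.copy())
    return np.array(traj)
```

Calling `rollout(np.array([1.0, 0.0]), [0.0, 0.0])` keeps pushing the point toward the wall at $x = 1$. In this sketch the projected motion slows as the slack of that wall shrinks, so the point approaches the boundary without crossing it (up to integration error), whereas applying the raw action directly would leave the workspace. The policy producing `a` is untouched, which mirrors the claim that only the differentiable constraints are needed.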
