Zero-shot Safety Prediction for Autonomous Robots with Foundation World Models
Zhenjiang Mao, Siqi Dai, Yuang Geng, Ivan Ruchkin
TL;DR
This work tackles safety prediction for autonomous robots using world models by introducing foundation world models that align observations with causally meaningful latent representations via segmentation (SAM) and a training-free latent predictor (LLM). The approach replaces opaque pixel-based evaluation with an object-centric centroid-distance metric and enables zero-shot forecasting of future states through a prompt-assembly mechanism for LLMs, reducing data needs and mitigating distribution shift. Evaluations on cart-pole and lunar lander benchmarks show competitive safety performance against supervised methods and improvements over standard world models, especially in long-horizon predictions, while highlighting some limitations due to prompt length and data constraints. Overall, the method advances safety-critical robotics by leveraging foundation models to produce interpretable latent states and zero-shot safety predictions with practical implications for data-efficient, safer autonomous systems.
Abstract
A world model creates a surrogate world in which to train a controller and predict safety violations by learning a system's internal dynamics. However, existing world models rely solely on statistical learning of how observations change in response to actions and lack a precise quantification of how accurate the surrogate dynamics are, which poses a significant challenge in safety-critical systems. To address this challenge, we propose foundation world models that embed observations into meaningful and causal latent representations. This enables the surrogate dynamics to directly predict causal future states by leveraging a training-free large language model. On two common benchmarks, this novel model outperforms standard world models in the safety prediction task and performs comparably to supervised learning despite not using any data. We evaluate its performance with a more specialized and system-relevant metric that compares estimated states instead of aggregating observation-wide error.
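To illustrate what comparing estimated states rather than aggregating observation-wide error can look like, the following is a minimal sketch of an object-centric centroid-distance computation. It assumes each object of interest is available as a binary segmentation mask (for example, one produced by a segmentation model) stored as a nested list of 0/1 values. The function names `mask_centroid` and `centroid_distance` are hypothetical and chosen for illustration only; this is not the paper's implementation.

```python
import math

def mask_centroid(mask):
    """Return the (row, col) centroid of the nonzero pixels in a binary mask.

    `mask` is assumed to be a nested list of 0/1 values (an illustrative format).
    Returns None if the mask is empty, i.e. the object is not visible.
    """
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return None
    return (row_sum / total, col_sum / total)

def centroid_distance(pred_mask, true_mask):
    """Euclidean distance, in pixels, between the centroids of two masks.

    Returns infinity if either mask is empty (an assumption of this sketch).
    """
    pred_c = mask_centroid(pred_mask)
    true_c = mask_centroid(true_mask)
    if pred_c is None or true_c is None:
        return math.inf
    return math.dist(pred_c, true_c)

# Toy example: the predicted object is shifted one pixel to the right.
true_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred_mask = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(centroid_distance(pred_mask, true_mask))  # 1.0
```

In this toy case the distance is one pixel, which maps directly onto a displacement of the object. A pixel-wise error averaged over the whole frame would instead depend on the image size and on how much of the frame the object occupies.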
