Zero-shot Safety Prediction for Autonomous Robots with Foundation World Models

Zhenjiang Mao, Siqi Dai, Yuang Geng, Ivan Ruchkin

TL;DR

This work tackles safety prediction for autonomous robots using world models by introducing foundation world models that align observations with causally meaningful latent representations via segmentation (SAM) and a training-free latent predictor (LLM). The approach replaces opaque pixel-based evaluation with an object-centric centroid-distance metric and enables zero-shot forecasting of future states through a prompt-assembly mechanism for LLMs, reducing data needs and mitigating distribution shift. Evaluations on cart-pole and lunar lander benchmarks show competitive safety performance against supervised methods and improvements over standard world models, especially in long-horizon predictions, while highlighting some limitations due to prompt length and data constraints. Overall, the method advances safety-critical robotics by leveraging foundation models to produce interpretable latent states and zero-shot safety predictions with practical implications for data-efficient, safer autonomous systems.

Abstract

A world model creates a surrogate world in which to train a controller and predict safety violations by learning the internal dynamics of a system. However, existing world models rely solely on statistical learning of how observations change in response to actions and offer no precise quantification of how accurate the surrogate dynamics are, which poses a significant challenge in safety-critical systems. To address this challenge, we propose foundation world models that embed observations into meaningful, causal latent representations. This enables the surrogate dynamics to directly predict causal future states by leveraging a training-free large language model. In two common benchmarks, this novel model outperforms standard world models in the safety prediction task and performs comparably to supervised learning despite not using any training data. We evaluate its performance with a more specialized and system-relevant metric that compares estimated states instead of aggregating observation-wide error.
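The idea of comparing estimated states rather than aggregating pixel-wide error can be illustrated with a small sketch. It assumes each object is given as a binary segmentation mask (a grid of 0/1 values) and that the metric is the Euclidean distance between the centroids of the predicted and true masks. The paper's exact definition is not given in this section, so the function names `mask_centroid`, `centroid_distance`, and `make_mask` are hypothetical and the choice of Euclidean distance is an assumption.

```python
import math


def mask_centroid(mask):
    """Return the (row, col) centroid of a binary mask, or None if the mask is empty."""
    total = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                total += 1
                row_sum += r
                col_sum += c
    if total == 0:
        return None
    return row_sum / total, col_sum / total


def centroid_distance(pred_mask, true_mask):
    """Euclidean distance between mask centroids (assumed metric, not the paper's code)."""
    pred = mask_centroid(pred_mask)
    true = mask_centroid(true_mask)
    if pred is None or true is None:
        # An object missing from either mask is treated as an unbounded error.
        return math.inf
    return math.hypot(pred[0] - true[0], pred[1] - true[1])


def make_mask(height, width, rows, cols):
    """Hypothetical helper: build a mask with a filled rectangle over rows x cols."""
    return [[1 if r in rows and c in cols else 0 for c in range(width)]
            for r in range(height)]


# Toy example: the same 2x2 object, shifted three columns to the right.
true_mask = make_mask(6, 6, range(2, 4), range(1, 3))
pred_mask = make_mask(6, 6, range(2, 4), range(4, 6))
distance = centroid_distance(pred_mask, true_mask)
print(distance)
```

Unlike a pixel-wise error, this value depends only on where the object is, so a prediction that renders the background slightly differently but places the object correctly is not penalized.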

Paper Structure

This paper contains 13 sections, 6 equations, 4 figures, 6 tables, 2 algorithms.

Figures (4)

  • Figure 1: A dynamical model (blue flow) vs. a world model (orange flow).
  • Figure 2: The structure of an existing world model (left) and the proposed foundation world model (right).
  • Figure 3: An example of segmentation matrices. Upper, from left to right: segmentation of the cart, pole, lower background, and upper background. Lower, from left to right: segmentation of the lander, lander point flag, lower background, and upper background.
  • Figure 4: The observation of cart pole (upper) and lunar lander (lower). From left to right, generated by: the true observation model, a foundation world model, an existing world model without distribution shift, and an existing world model with distribution shift.