Learning Safe-Stoppability Monitors for Humanoid Robots

Yifan Sun; Yiyuan Pan; Shangtao Li; Caiwu Ding; Tao Cui; Lingyun Wang; Changliu Liu

Abstract

Emergency stop (E-stop) mechanisms are the de facto standard for robot safety. However, for humanoid robots, abruptly cutting power can itself cause catastrophic failures; instead, an emergency stop must execute a predefined fallback controller that preserves balance and drives the robot toward a minimum-risk condition. This raises a critical question: from which states can a humanoid robot safely execute such a stop? In this work, we formalize emergency stopping for humanoids as a policy-dependent safe-stoppability problem and use data-driven approaches to characterize the safe-stoppable envelope. We introduce PRISM (Proactive Refinement of Importance-sampled Stoppability Monitor), a simulation-driven framework that learns a neural predictor for state-level stoppability. PRISM iteratively refines the decision boundary using importance sampling, enabling targeted exploration of rare but safety-critical states. This targeted exploration significantly improves data efficiency while reducing false-safe predictions under a fixed simulation budget. We further demonstrate sim-to-real transfer by deploying the pretrained monitor on a real humanoid platform. Results show that modeling safety as policy-dependent stoppability enables proactive safety monitoring and supports scalable certification of fail-safe behaviors for humanoid robots.
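The refinement loop described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-dimensional state, the `simulate_fallback` stand-in for a fallback rollout, the logistic model used in place of a neural predictor, and the uncertainty-weighted proposal, which is only a simplified substitute for the importance-sampling scheme the paper uses (not detailed in this section).

```python
import math
import random


def simulate_fallback(x):
    """Hypothetical stand-in for a simulated fallback rollout.

    Returns 1 if the stop succeeds from state x, else 0. The 0.6
    threshold is an arbitrary toy boundary, not a value from the paper.
    """
    return 1 if x < 0.6 else 0


def predict(model, x):
    """Predicted probability that state x is safely stoppable."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def fit_logistic(data, steps=2000, lr=0.5):
    """Fit sigmoid(w * x + b) to (state, label) pairs by gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = predict((w, b), x) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def refine(rounds=3, n_init=20, n_new=20, n_candidates=500, seed=0):
    """Alternate between fitting the monitor and sampling near its boundary."""
    rng = random.Random(seed)
    # Initial dataset: states drawn uniformly and labeled by simulation.
    data = [(x, simulate_fallback(x)) for x in (rng.random() for _ in range(n_init))]
    model = fit_logistic(data)
    for _ in range(rounds):
        candidates = [rng.random() for _ in range(n_candidates)]
        # p * (1 - p) peaks where the monitor is least certain, so the
        # proposal concentrates new rollouts near the decision boundary.
        weights = []
        for x in candidates:
            p = predict(model, x)
            weights.append(p * (1.0 - p) + 1e-9)
        picked = rng.choices(candidates, weights=weights, k=n_new)
        data += [(x, simulate_fallback(x)) for x in picked]
        model = fit_logistic(data)
    return model, data
```

Calling `refine()` returns the fitted `(w, b)` pair and the accumulated labeled states. Unlike a full importance-sampling estimator, this sketch does not reweight the training loss by the proposal density, so it shows only the targeted-sampling idea.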

Paper Structure (25 sections, 7 equations, 5 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Fail-Safe Safety and Minimum Risk Conditions
Reach-Avoid Analysis for Fail-Safe Control
Sim-to-Real Deployment for Humanoid Robots
Problem Formulation
System Model
Safe-Stoppable Envelope (SSE) and Stoppability Monitor
Simulation-Based Stoppability Estimation
Efficient Safe-Stop Outcome Labeling
Learning the Stoppability Monitor
Iterative Refinement via Importance Sampling
Initial Dataset Construction
Data Refinement via Importance Sampling
Experiments
...and 10 more sections

Figures (5)

Figure 1: Safe-stoppability monitoring for humanoid robots. Emergency stops for humanoids cannot simply cut power; instead, a predefined fallback controller is triggered to drive the robot toward a minimum-risk condition (MRC). We simulate fallback executions from nominal states in a digital twin to label whether the robot can safely reach the terminal configuration without collision or failure. These labels are used to train a neural monitor that predicts the real-time safe-stoppability confidence of the current state. The monitor not only discourages premature execution of the fallback policy, but also enables proactive intervention by triggering the fallback policy before the robot enters states from which safe stopping is no longer possible.
Figure 2: Data-efficient stoppability monitoring via PRISM. Exhaustively collecting failure data to learn SSE boundaries on physical humanoids is prohibitively expensive and risks severe hardware damage. To overcome this labeling bottleneck, PRISM dynamically tracks prediction errors and allocates dense sampling to highly uncertain boundary regions (the IS Window). By focusing on critical unstoppable states rather than trivially safe regions, PRISM reduces data requirements while improving boundary characterization. Red dots denote counts, while the shaded background indicates density values.
Figure 3: Experimental platforms used: (Left) Real humanoid deployment. (Middle) AR-based data collection. (Right) Simulation-based fallback policy rollouts. AR-based data collection improves safety during real-robot experiments by projecting the virtual environment onto the real robot, enabling realistic data collection while performing collision checking in simulation to avoid physical collisions.
Figure 4: Safety score over a full loco-manipulation trajectory. The monitor exhibits precise temporal alignment with the ground truth and identifies elevated risks during manipulation phases (Pick, Place) while ensuring high confidence during steady-state locomotion (Transfer, Leave).
Figure 5: Spatial projection of the stoppability boundaries. For both (a) Real-World and (b) Simulation, the scatter plots (left panels) display the Cartesian $(x,y)$ state distributions colored by ground-truth labels. The contour maps (right panels) illustrate the interpolated predictive safety regions overlaid with state density.
