From Demonstrations to Safe Deployment: Path-Consistent Safety Filtering for Diffusion Policies

Ralf Römer, Julian Balletshofer, Jakob Thumm, Marco Pavone, Angela P. Schoellig, Matthias Althoff

TL;DR

This work introduces Path-Consistent Safety Filtering (PACS) to safely deploy diffusion policies in dynamic environments by translating action chunks into intended trajectories and verifying safety with set-based reachability while staying on the policy's path. PACS provides formal safety guarantees at real-time rates (≈1 kHz) and maintains task performance, outperforming reactive safety mechanisms such as control barrier functions in both simulation (up to 68% higher task success) and real-world human-robot interaction tasks (up to 37% higher in Sorting). The method relies on an intermediate trajectory planning module that aligns safety checks with the policy’s action chunks, enabling fine-grained, path-consistent braking and failsafe behavior without leaving the learned data distribution. Experimental results on Robomimic benchmarks and real hardware demonstrate PACS’s effectiveness in preventing unsafe states while preserving high task success across three HRI scenarios (Sorting, Handover, Feeding), illustrating practical impact for safety-critical robotics deployment.

Abstract

Diffusion policies (DPs) achieve state-of-the-art performance on complex manipulation tasks by learning from large-scale demonstration datasets, often spanning multiple embodiments and environments. However, they cannot guarantee safe behavior, so external safety mechanisms are needed. Such mechanisms typically alter actions in ways unseen during training, causing unpredictable behavior and performance degradation. To address these problems, we propose path-consistent safety filtering (PACS) for DPs. Our approach performs path-consistent braking on a trajectory computed from the sequence of generated actions. In this way, we keep execution consistent with the policy's training distribution, maintaining the learned, task-completing behavior. To enable real-time deployment and handle uncertainties, we verify safety using set-based reachability analysis. Our experimental evaluation in simulation and on three challenging real-world human-robot interaction tasks shows that PACS (a) provides formal safety guarantees in dynamic environments, (b) preserves task success rates, and (c) outperforms reactive safety approaches, such as control barrier functions, by up to 68% in terms of task success. Videos are available at our project website: https://tum-lsy.github.io/pacs/.
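
To make the idea of path-consistent braking concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes a piecewise-linear path through the waypoints, a simple bounded acceleration along the path, and a hypothetical `is_safe` callback that stands in for the safety verification. The names `follow_path`, `v_nominal`, and `a_max` are illustrative only. The point it shows is that only the speed along the path changes, while the geometry of the path stays fixed.

```python
from bisect import bisect_right
from math import dist


def cumulative_arc_length(waypoints):
    """Return the cumulative arc length at each waypoint of a piecewise-linear path."""
    lengths = [0.0]
    for p, q in zip(waypoints, waypoints[1:]):
        lengths.append(lengths[-1] + dist(p, q))
    return lengths


def point_on_path(waypoints, lengths, s):
    """Interpolate the position at arc length s, clamped to the ends of the path."""
    s = min(max(s, 0.0), lengths[-1])
    i = min(bisect_right(lengths, s) - 1, len(waypoints) - 2)
    segment = lengths[i + 1] - lengths[i]
    t = 0.0 if segment == 0.0 else (s - lengths[i]) / segment
    return tuple(a + t * (b - a) for a, b in zip(waypoints[i], waypoints[i + 1]))


def follow_path(waypoints, is_safe, v_nominal=0.5, a_max=2.0, dt=0.01, steps=400):
    """Advance along the path, changing only the path speed.

    is_safe(time, position) is a hypothetical stand-in for the safety check.
    """
    lengths = cumulative_arc_length(waypoints)
    s, v, visited = 0.0, 0.0, []
    for k in range(steps):
        position = point_on_path(waypoints, lengths, s)
        visited.append(position)
        if is_safe(k * dt, position):
            v = min(v_nominal, v + a_max * dt)  # speed back up toward nominal
        else:
            v = max(0.0, v - a_max * dt)        # brake, but stay on the path
        s = min(s + v * dt, lengths[-1])
    return visited
```

For example, `follow_path([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], lambda t, p: not (0.5 <= t <= 1.5))` slows to a stop while the check fails and then resumes, and every visited point still lies on the original two segments.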

Paper Structure

This paper contains 20 sections, 7 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Deploying DPs in dynamic environments with moving objects requires safeguarding mechanisms, as the intended policy actions may be unsafe. Reactive strategies, such as control barrier functions, often drive the agent into out-of-distribution (OOD) states not seen during training, leading to unpredictable behavior. We propose that safety mechanisms for DPs should remain consistent with the robot’s intended path to avoid out-of-distribution states and preserve high task success rates.
  • Figure 2: System overview of our proposed path-consistent safety filter (PACS). The policy, conditioned on visual observations and proprioceptive inputs, generates action chunks that are transformed into a sequence of desired waypoints. From these waypoints, we compute a kinematically and dynamically feasible intended trajectory. PACS continuously monitors this trajectory and applies high-frequency safety filtering using reachability analysis to enforce task-specific safety constraints (e.g., collision avoidance or impact force limits).
  • Figure 3: Visualization of our three real-world tasks, which require safe and reactive motion in close proximity to the human body. Sorting represents a coexistence task where no collisions are permitted, whereas Handover and Feeding are collaboration tasks requiring non-harmful, low-force interactions.
  • Figure 4: End effector paths for the Sorting task. The color intensity of the trajectories indicates the velocity, and the intensity of the shaded grey areas visualizes the training distribution. Our safety filter slows down the policy without leaving the desired path when the human is nearby. In contrast, the control barrier function (CBF) pushes the robot away from unsafe states, which often leads to out-of-distribution (OOD) states from which the policy cannot recover.
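
The reachability-based verification mentioned in the captions above can be illustrated with a deliberately simplified sketch. It rests on assumptions that are not from the paper: the robot and the human are each over-approximated by a single ball, the human may move in any direction at up to `human_v_max`, and the robot ball is already inflated to cover its motion between samples. The function names are hypothetical. The check accepts a braking trajectory only if the robot stays clear of every place the human could reach at the matching time.

```python
from math import dist


def human_reachable_radius(radius, v_max, t):
    """Radius of a ball that over-approximates where the human can be after time t."""
    return radius + v_max * t


def braking_is_safe(robot_positions, robot_radius, human_center, human_radius,
                    human_v_max, dt):
    """Return True if no robot ball along the braking trajectory can touch the human.

    robot_positions[k] is the robot ball center during step k. The human ball is
    grown to the end of that step, which keeps the check conservative.
    """
    for k, position in enumerate(robot_positions):
        reach = human_reachable_radius(human_radius, human_v_max, (k + 1) * dt)
        if dist(position, human_center) <= robot_radius + reach:
            return False
    return True
```

Because the human ball only grows with time, a shorter braking trajectory is easier to verify, which is one intuition for why checks of this kind are paired with a braking maneuver that brings the robot to a stop.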