SafeFlowMPC: Predictive and Safe Trajectory Planning for Robot Manipulators with Learning-based Policies

Thies Oelerich; Gerald Ebmer; Christian Hartl-Nesic; Andreas Kugi

TL;DR

SafeFlowMPC addresses safe, real-time trajectory planning for robot manipulators by fusing learning-based flow matching with online model-predictive control. It defines safety and performance manifolds and enforces safety through iterative flow steps and trajectory projection, guaranteeing safety via a terminal constraint while allowing reactive planning. The approach is validated on a 7-DoF KUKA manipulator across global-to-local planning, online grasping, and dynamic handover tasks, showing competitive efficiency, high success rates, and strong safety guarantees compared to baselines. This work advances practical safe learning-based planning for manipulators operating in dynamic environments, enabling reliable interaction with humans and objects in real time.
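The alternation between flow steps and projection described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `velocity_field` is a hand-written straight-line flow standing in for the learned flow-matching model, `project_safe` uses simple box clipping on joint limits as a stand-in for the projection onto the safety manifold, and the flow time `s` is assumed to run from 0 to 1 in equal Euler steps.

```python
import numpy as np


def velocity_field(q, s, q_goal):
    """Hypothetical stand-in for a learned flow-matching velocity field.

    Uses the straight-line flow v = (q_goal - q) / (1 - s), which
    transports any sample to q_goal as s goes from 0 to 1.
    """
    return (q_goal - q) / max(1.0 - s, 1e-6)


def project_safe(q, lower, upper):
    """Toy projection: clip every waypoint to box joint limits.

    Assumption: the real safety manifold is more complex than a box.
    """
    return np.clip(q, lower, upper)


def safe_flow(q_init, q_goal, lower, upper, n_steps=10):
    """Alternate Euler flow steps with projection back to the safe set."""
    q = project_safe(q_init, lower, upper)
    ds = 1.0 / n_steps
    for k in range(n_steps):
        s = k * ds
        q = q + ds * velocity_field(q, s, q_goal)  # flow step
        q = project_safe(q, lower, upper)          # projection step
    return q


# Usage: a trajectory of 20 waypoints for a 7-DoF arm (toy data).
rng = np.random.default_rng(0)
q_init = rng.normal(size=(20, 7))             # noisy initial trajectory
q_goal = 1.5 * np.sin(np.linspace(0.0, np.pi, 20))[:, None] * np.ones((1, 7))
lower, upper = -1.0, 1.0                      # toy joint limits
q_final = safe_flow(q_init, q_goal, lower, upper)
```

Because the projection runs after every flow step, each intermediate trajectory in this toy setting stays inside the limits, even where the unconstrained target `q_goal` leaves them.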

Abstract

The emerging integration of robots into everyday life brings several major challenges. Compared to classical industrial applications, more flexibility is needed in combination with real-time reactivity. Learning-based methods can train powerful policies based on demonstrated trajectories, such that the robot generalizes a task to similar situations. However, these black-box models lack interpretability and rigorous safety guarantees. Optimization-based methods provide these guarantees but lack the required flexibility and generalization capabilities. This work proposes SafeFlowMPC, which pairs flow matching with online optimization to combine the strengths of learning and optimization. The method guarantees safety at all times and is designed to meet the demands of real-time execution by using a suboptimal model-predictive control formulation. SafeFlowMPC achieves strong performance in three real-world experiments on a KUKA 7-DoF manipulator, namely two grasping experiments and a dynamic human-robot object handover experiment. A video of the experiments is available at http://www.acin.tuwien.ac.at/42d6. The code is available at https://github.com/TU-Wien-ACIN-CDS/SafeFlowMPC.

Paper Structure (15 sections, 2 theorems, 15 equations, 6 figures, 3 tables, 1 algorithm)

Introduction
Formulation
Safety Manifolds
Flow Matching on Manifolds
Mathematical Formulation
Training Procedure
Trajectory projection
Inference
Practical Implementation: Robot Manipulator
Experiments
Experiment 1: Global trajectory planning made local
Experiment 2: Online replanning for object grasps
Experiment 3: Dynamic human-robot object handover
Limitations
Conclusions and Future Work

Key Result

Theorem 1

The projected trajectory $\boldsymbol{q} _{\mathrm{proj}}(t) = \mathcal{P}^{\mathrm{safe}}_{ \boldsymbol{q} }( \boldsymbol{q} _{\mathrm{init}}(t))$ extending from the initial time $t = t_{0}$ to the end time $t = t_{0} + T$ is safe for all times $t > t_{0} + T$. Safety at time $t$ is defined by

Figures (6)

Figure 1: Visual explanation of the optimization scheme of the proposed method. At time step $i$ the trajectory $\boldsymbol{q} ^{i}_{s}(t)$ on the safety manifold $\mathcal{M}_{\mathrm{safe}}$ is improved by a flow matching step (red) and projected back onto $\mathcal{M}_{\mathrm{safe}}$ (blue) to obtain the safe trajectory $\boldsymbol{q} ^{i}_{\mathrm{s + \Delta s}}(t)$. These two steps are executed multiple times for each time step (dashed lines) to traverse the flow from $s = 0$ to $s = 1$.
Figure 2: Planning scheme of SafeFlowMPC for the trajectory planning of a robot manipulator around an obstacle $\mathcal{O}$. At step $i$, the robot plans the trajectory $\boldsymbol{q}_{1}^{i}(t)$ on the safety manifold $\mathcal{M}^{i}_{\mathrm{safe}}$. This trajectory ends in the terminal safety set $\mathcal{S}_{T}^{i}$ indicated by the blue shaded area. At step $i+1$ the planner reuses the previous trajectory according to \ref{eq:next_traj} and transfers it (red shaded area) to $\boldsymbol{q}_{1}^{i+1}(t)$ on $\mathcal{M}^{i+1}_{\mathrm{safe}}$.
Figure 3: Experiment 1: Environment with red obstacles and green object to grasp. An example trajectory for three planners is shown. The robot is visualized for the start and end configuration of the SafeFlowMPC trajectory.
Figure 4: Experiment 1: Maximum values of the normalized joint velocity for the example trajectories in Figure 3.
Figure 5: Experiment 1: Norm of joint velocity $\dot{\boldsymbol{q}}_{T}$, acceleration $\ddot{\boldsymbol{q}}_{T}$, and jerk $\dddot{\boldsymbol{q}}_{T}$ at the end of each planning horizon using SafeFlowMPC for the example trajectory in Figure 3.
...and 1 more figure

Theorems & Definitions (4)

Theorem 1
proof
Theorem 2
proof
