Bridging the gap between Learning-to-plan, Motion Primitives and Safe Reinforcement Learning

Piotr Kicki; Davide Tateo; Puze Liu; Jonas Guenster; Jan Peters; Krzysztof Walas

TL;DR

This work addresses planning under kinodynamic and safety constraints by integrating learning-to-plan with safe reinforcement learning and motion primitives. It introduces a constrained trajectory generation framework that uses learnable Lagrange multipliers to balance reward and constraint satisfaction, while decoupling task optimization from constraint handling. The authors advocate a B-spline trajectory representation with fixed knots and time scaling to tightly couple boundary conditions and priors, enabling efficient, safe learning and online composition of segments. Empirical results on heavy-object manipulation and robot air hockey show superior performance and safety compliance compared with state-of-the-art baselines, including successful zero-shot transfer to a real robot. Overall, the approach demonstrates a practical path to deploying dynamic, constraint-aware robotic behaviors in complex, unknown environments without requiring full task models.
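To make the idea of a learnable Lagrange multiplier concrete, the following is a minimal sketch on a toy scalar problem, not the paper's actual update rule or trajectory representation. It assumes a hypothetical objective (x - 2)^2 standing in for negative reward and a hypothetical constraint x - 1 <= 0, and it keeps the multiplier positive by learning its logarithm. All names, step sizes, and the problem itself are illustrative assumptions.

```python
import math


def solve_toy_constrained_problem(steps=5000, lr_x=0.01, lr_dual=0.01):
    """Gradient descent on x and gradient ascent on a learnable multiplier.

    Toy problem (an assumption for illustration only):
        minimize (x - 2)^2   subject to   x - 1 <= 0
    The Lagrangian is L(x, lam) = (x - 2)^2 + lam * (x - 1), with lam > 0.
    """
    x = 0.0          # primal variable (stands in for policy/trajectory parameters)
    log_lam = 0.0    # multiplier is parameterized as lam = exp(log_lam) > 0

    for _ in range(steps):
        lam = math.exp(log_lam)
        violation = x - 1.0  # positive when the constraint is violated

        # Primal step: descend the Lagrangian with respect to x.
        grad_x = 2.0 * (x - 2.0) + lam
        x -= lr_x * grad_x

        # Dual step: ascend the Lagrangian with respect to log_lam.
        # d/d(log_lam) [lam * violation] = lam * violation
        log_lam += lr_dual * lam * violation

    return x, math.exp(log_lam)


if __name__ == "__main__":
    x_final, lam_final = solve_toy_constrained_problem()
    print(f"x = {x_final:.4f}, lambda = {lam_final:.4f}")
```

For this toy problem the KKT conditions give x = 1 and a multiplier of 2, so the iterates should settle near those values: the multiplier grows while the constraint is violated and shrinks once it is satisfied with slack, which is the balancing behavior the summary describes.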

Abstract

Trajectory planning under kinodynamic constraints is fundamental for advanced robotics applications that require dexterous, reactive, and rapid skills in complex environments. These constraints, which may represent task, safety, or actuator limitations, are essential for ensuring the proper functioning of robotic platforms and preventing unexpected behaviors. Recent advances in kinodynamic planning demonstrate that learning-to-plan techniques can generate complex and reactive motions under intricate constraints. However, these techniques necessitate the analytical modeling of both the robot and the entire task, a limiting assumption when systems are extremely complex or when constructing accurate task models is prohibitive. This paper addresses this limitation by combining learning-to-plan methods with reinforcement learning, resulting in a novel integration of black-box learning of motion primitives and optimization. We evaluate our approach against state-of-the-art safe reinforcement learning methods, showing that our technique, particularly when exploiting task structure, outperforms baseline methods in challenging scenarios such as planning to hit in robot air hockey. This work demonstrates the potential of our integrated approach to enhance the performance and safety of robots operating under complex kinodynamic constraints.

Paper Structure (23 sections, 6 equations, 7 figures, 7 tables, 1 algorithm)

Introduction
Constrained Reinforcement Learning with Motion Primitives
Learning under known constraints
Motion Primitives for Safe Reinforcement Learning
Experimental Evaluation
Heavy object task
Robot Air Hockey Hitting
Limitations
Conclusion
Environments
Heavy Object
Task definition.
Air Hockey Hitting
Task definition.
Air Hockey real robot deployment
...and 8 more sections

Figures (7)

Figure 1: Overview of the proposed constrained trajectory generation method.
Figure 2: Learning curves (reward w.r.t. number of simulation steps) for (a) the heavy object task without prior knowledge, (b) the heavy object task with prior knowledge, and (c) the air hockey hitting task.
Figure 3: Statistical analysis of the considered approaches on the simulated Air Hockey hitting task.
Figure 4: Statistical comparison of and on the Air Hockey hitting with real robot.
Figure 5: Visualization of the task of moving a heavy object.
...and 2 more figures
