Jointly Learning Cost and Constraints from Demonstrations for Safe Trajectory Generation
Shivam Chaubey, Francesco Verdoja, Ville Kyrki
TL;DR
This work addresses the challenge of learning safe trajectories from demonstrations when both the cost function and the safety constraints are unknown. It proposes a two-step optimization: the first step infers the cost parameters $Q$ and $R$ under the assumption that the unknown constraints are only intermittently active, and the second step uses the estimated cost to identify the unknown inequality constraints within a KKT-based framework. The approach combines demonstration segmentation, cost extraction from segments where the constraints are inactive, inclusive and exclusive constraint representations (with a convex relaxation for the exclusive ones), and a constraint-learning formulation solved by semidefinite programming. Results from eight simulated obstacle-avoidance scenarios and a real robotic pouring task with a Panda robot show that accurate cost estimation is critical for reliable constraint recovery and safe trajectory generation. They also highlight limitations in outlier handling and the potential for more expressive constraint models. Overall, the method enables safe, demonstration-driven trajectory synthesis without requiring prior knowledge of the cost function, with practical implications for robot safety and autonomous manipulation.
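For orientation, the KKT conditions underlying such a framework can be written in generic form. The notation below ($\xi$ for a demonstrated trajectory, $c_{Q,R}$ for the cost, $g_j$ for the unknown inequality constraints, $\lambda_j$ for their multipliers) is assumed here for illustration and is not taken from the paper; equality constraints such as the system dynamics are omitted for brevity.

$$
\begin{aligned}
\nabla_{\xi}\, c_{Q,R}(\xi) + \sum_{j} \lambda_j \nabla_{\xi}\, g_j(\xi) &= 0 && \text{(stationarity)} \\
g_j(\xi) &\le 0 && \text{(primal feasibility)} \\
\lambda_j &\ge 0 && \text{(dual feasibility)} \\
\lambda_j\, g_j(\xi) &= 0 && \text{(complementary slackness)}
\end{aligned}
$$

On a segment where every unknown constraint is inactive, $g_j(\xi) < 0$, so complementary slackness forces $\lambda_j = 0$ and stationarity reduces to $\nabla_{\xi}\, c_{Q,R}(\xi) = 0$. That reduced condition involves only the cost parameters, which is why the cost can plausibly be estimated before the constraints are known.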
Abstract
Learning from Demonstration allows robots to mimic human actions. However, these methods do not model the constraints that are crucial to ensuring the safety of the learned skill. Moreover, even when constraints are modelled explicitly, they rely on the assumption of a known cost function, which limits their practical usability for tasks with unknown cost. In this work we propose a two-step optimization process that estimates both cost and constraints by decoupling the learning of the cost function from the identification of unknown constraints within the demonstrated trajectories. First, we identify the cost function by isolating the effect of constraints to parts of the demonstrations. Subsequently, a constraint learning method is used to identify the unknown constraints. Our approach is validated on both simulated trajectories and a real robotic manipulation task. Our experiments show the impact that incorrect cost estimation has on the learned constraints and illustrate how the proposed method is able to infer unknown constraints, such as obstacles, from demonstrated trajectories without any initial knowledge of the cost.
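The first step, recovering a cost from a part of a demonstration that no constraint affects, can be illustrated with a deliberately small toy problem. The sketch below is not the authors' formulation: it assumes a scalar integrator $x_{t+1} = x_t + u_t$, a quadratic cost $\sum_t q\,x_t^2 + r\,u_t^2$ with $r$ fixed to 1 to remove the scale ambiguity, and a noise-free, unconstrained demonstration. The function names `simulate_demo` and `estimate_q` are hypothetical.

```python
import numpy as np


def simulate_demo(q, r, x0, T):
    """Generate an optimal unconstrained demonstration for the toy problem.

    States satisfy x = x0 * 1 + L u, where L is lower-triangular ones,
    and the cost is J(u) = q * ||x||^2 + r * ||u||^2.
    """
    L = np.tril(np.ones((T, T)))
    ones = np.ones(T)
    # Stationarity: (q L^T L + r I) u = -q x0 L^T 1
    H = q * (L.T @ L) + r * np.eye(T)
    u = np.linalg.solve(H, -q * x0 * (L.T @ ones))
    x = x0 * ones + L @ u
    return x, u


def estimate_q(x, u, r=1.0):
    """Recover q from the stationarity residual q * L^T x + r * u = 0.

    With r fixed, the residual is linear in q, so the least-squares
    estimate has a closed form.
    """
    T = len(u)
    L = np.tril(np.ones((T, T)))
    a = L.T @ x  # gradient direction contributed by the state cost
    return float(-(a @ (r * u)) / (a @ a))


# Demonstration generated with a "true" state weight that the learner does not see.
x, u = simulate_demo(q=2.5, r=1.0, x0=1.0, T=20)
q_hat = estimate_q(x, u)
print("estimated q:", q_hat)
```

Because the demonstration is exactly optimal and unconstrained, the stationarity residual vanishes at the true weight and the estimate matches it up to numerical precision. If an unknown constraint were active on part of the trajectory, the same residual would pick up a multiplier term and bias the estimate, which is the motivation for restricting cost learning to unaffected segments.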
