Jointly Learning Cost and Constraints from Demonstrations for Safe Trajectory Generation

Shivam Chaubey; Francesco Verdoja; Ville Kyrki

TL;DR

This work addresses the challenge of learning safe trajectories from demonstrations when both the cost function and the safety constraints are unknown. It proposes a two-step optimization: first, the cost parameters $Q$ and $R$ are inferred under the assumption that the unknown constraints are only intermittently active; second, the unknown inequality constraints are identified with a KKT-based framework that uses the estimated cost. The approach combines demonstration segmentation, cost extraction from inactive segments, inclusive/exclusive constraint representations with a convex relaxation for the exclusive ones, and a constraint-learning formulation solved by semidefinite programming. Results from eight simulated obstacle-avoidance scenarios and a real robotic pouring task with a Panda manipulator show that accurate cost estimation is critical for reliable constraint recovery and safe trajectory generation. They also expose limitations in outlier handling and point to the potential of more expressive constraint models. Overall, the method enables safe, demonstration-driven trajectory synthesis without prior knowledge of the cost function, with practical implications for robot safety and autonomous manipulation.
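
As a hedged illustration of the KKT-based reasoning summarized above, the conditions can be written for a generic constrained trajectory problem. The notation is an assumption made for this sketch and is not taken from the paper: $\xi$ is the stacked trajectory, $c(\xi; Q, R)$ the cost, $h(\xi) = 0$ the known equality constraints (such as dynamics), and $g(\xi) \le 0$ the unknown inequality constraints, with multipliers $\nu$ and $\lambda$.

$$
\begin{aligned}
\nabla_\xi c(\xi; Q, R) + \nabla_\xi g(\xi)^\top \lambda + \nabla_\xi h(\xi)^\top \nu &= 0 && \text{(stationarity)} \\
g(\xi) \le 0, \quad h(\xi) &= 0 && \text{(primal feasibility)} \\
\lambda &\ge 0 && \text{(dual feasibility)} \\
\lambda_i \, g_i(\xi) &= 0 \quad \forall i && \text{(complementary slackness)}
\end{aligned}
$$

Wherever an unknown constraint is inactive, complementary slackness forces its multiplier to zero, so stationarity on those segments involves only the cost and the known constraints. This is why segments without active constraints are enough to estimate the cost in the first step.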

Abstract

Learning from Demonstration allows robots to mimic human actions. However, these methods do not model the constraints that are crucial to ensuring the safety of the learned skill. Moreover, even when constraints are modelled explicitly, existing methods assume a known cost function, which limits their practical usability for tasks with unknown cost. In this work we propose a two-step optimization process that estimates both cost and constraints by decoupling the learning of the cost function from the identification of unknown constraints within the demonstrated trajectories. First, we identify the cost function by isolating the effect of constraints on parts of the demonstrations. Subsequently, a constraint learning method is used to identify the unknown constraints. Our approach is validated both on simulated trajectories and on a real robotic manipulation task. Our experiments show the impact that incorrect cost estimation has on the learned constraints and illustrate how the proposed method is able to infer unknown constraints, such as obstacles, from demonstrated trajectories without any initial knowledge of the cost.
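
The first step, cost estimation from a constraint-free part of a demonstration, can be sketched on a toy problem. Everything below is an assumption made for illustration and is not the paper's formulation: a scalar single integrator $x_{t+1} = x_t + u_t$, a quadratic cost $\sum_{t<T} (q x_t^2 + r u_t^2) + q x_T^2$, and the hypothetical helper names `simulate_lqr` and `estimate_cost_ratio`. For this toy problem, eliminating the multipliers from the stationarity conditions gives $q\,x_t = r\,(u_t - u_{t-1})$ for $t = 1, \dots, T-1$, so the cost is identifiable only up to scale and the ratio $q/r$ follows from a one-parameter least-squares fit.

```python
def simulate_lqr(q, r, x0, horizon):
    """Optimal rollout for x[t+1] = x[t] + u[t] under the toy quadratic cost.

    Uses the scalar backward Riccati recursion with A = B = 1 and
    terminal weight q (all of this is an illustrative assumption).
    """
    p = q  # cost-to-go weight at the final step
    gains = []
    for _ in range(horizon):
        gains.append(p / (r + p))  # feedback gain computed from next-step weight
        p = q + r * p / (r + p)    # Riccati update for the current step
    gains.reverse()                # gains[t] now belongs to time step t

    xs, us = [x0], []
    for k in gains:
        u = -k * xs[-1]
        us.append(u)
        xs.append(xs[-1] + u)
    return xs, us


def estimate_cost_ratio(xs, us):
    """Least-squares estimate of q / r from an unconstrained segment.

    Fits rho in (u[t] - u[t-1]) = rho * x[t], which is the stationarity
    condition of the toy problem with the scale fixed by r = 1.
    """
    num = 0.0
    den = 0.0
    for t in range(1, len(us)):
        du = us[t] - us[t - 1]
        num += xs[t] * du
        den += xs[t] ** 2
    return num / den


# Generate a "demonstration" with a known cost, then recover the ratio.
xs, us = simulate_lqr(q=2.0, r=0.5, x0=1.0, horizon=10)
ratio = estimate_cost_ratio(xs, us)  # close to the true q / r = 4.0
```

Fixing $r = 1$ is one simple way to remove the scale ambiguity. If a segment with an active unknown constraint were included in the fit, the missing multiplier term would bias the estimate, which is consistent with the abstract's point that the cost should be identified from parts of the demonstrations where the effect of constraints is isolated.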

Paper Structure (20 sections, 27 equations, 3 figures, 1 table)

Introduction
Related Work
Problem formulation
Optimal trajectory problem
Constraints formulation
Cost and constraints estimation problem
Method
KKT formulation
Cost extraction
Demonstration segmentation
Cost estimation from inactive segments
Constraint extraction
Inequality constraints representation
Complementary exclusive constraints
Constraint learning formulation
...and 5 more sections

Figures (3)

Figure 1: Starting from a human demonstration, we propose a novel method to jointly learn both cost and constraints to allow the robot to replicate the task. Here, a user demonstrates dropping a ball from a cup to a target; the retrieved constraint limits the cup tilt ($\theta$) when the cup position ($y$) is not over the target, as shown in red in the graph.
Figure 2: The environments used in the simulation experiments. Each subcaption indicates the number of unknown constraints ($n_b$) and demonstrations ($L$) for that scenario. Legend: obstacles, optimal trajectory, and learned constraints.
Figure 3: Real manipulator experiment. Legend: collected demonstration, outlier, learned constraint, generated trajectory, trace of normalized $Q^j$, false positive, true positive.
