ExTraCT -- Explainable Trajectory Corrections from language inputs using Textual description of features
J-Anne Yow, Neha Priyadarshini Garg, Manoj Ramanathan, Wei Tech Ang
TL;DR
ExTraCT modifies robot trajectories from natural language by grounding each correction in a finite set of interpretable trajectory modification features, matched to the utterance through textual descriptions and semantic similarity. It separates language understanding from trajectory deformation: the corrected trajectory is $ξ^* = δ(φ^*, ξ_0, E)$, where $φ^* = \arg\max_{φ ∈ Φ} P(φ \mid l)$ is the feature that best matches the language input $l$, $ξ_0$ is the initial trajectory, and $E$ is the environment; a subsequent optimizer enforces kinodynamic constraints. Features are either scene-specific or scene-independent. Each feature $φ$ is described by a set of templates $T_φ$, which are embedded by an encoder $q$ and scored by cosine similarity against the utterance: $P(φ \mid l) ∝ \max_{t_φ ∈ T_φ} \frac{q(t_φ)^\top q(l)}{\lVert q(t_φ) \rVert \, \lVert q(l) \rVert}$. In user studies in simulation and on a physical robot arm, ExTraCT produced more accurate trajectories than an end-to-end baseline and was preferred in about 80% of cases, while offering better interpretability and generalization to unseen trajectories and object configurations. The system is also demonstrated on a manipulation task and an assistive feeding task.
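The matching rule above can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for the language-model encoder $q$, and the feature names and template strings in `TEMPLATES` are hypothetical examples chosen only to show the max-over-templates cosine score followed by the argmax over features.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for the encoder q(.): a bag-of-words count vector (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_feature(utterance, templates):
    """Return (feature, score), where score is the best cosine over that feature's templates."""
    q_l = embed(utterance)
    scores = {
        feature: max(cosine(embed(t), q_l) for t in descriptions)
        for feature, descriptions in templates.items()
    }
    best = max(scores, key=scores.get)  # arg max over the finite feature set
    return best, scores[best]


# Hypothetical features and descriptions, for illustration only.
TEMPLATES = {
    "distance_decrease(cup)": ["move closer to the cup", "go nearer to the cup"],
    "distance_increase(cup)": ["move further away from the cup", "stay away from the cup"],
    "move_left": ["go to the left", "move left"],
}

feature, score = match_feature("stay away from the cup", TEMPLATES)
print(feature, round(score, 3))
```

Because the score is proportional rather than normalized, only the ranking of features matters for selecting $φ^*$. A real encoder would also match paraphrases that share no words with any template, which this toy embedding cannot do.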
Abstract
Natural language provides an intuitive and expressive way of conveying human intent to robots. Prior works employed end-to-end methods for learning trajectory deformations from language corrections. However, such methods do not generalize to new initial trajectories or object configurations. This work presents ExTraCT, a modular framework for trajectory corrections using natural language that combines Large Language Models (LLMs) for natural language understanding with trajectory deformation functions. Given a scene, ExTraCT generates the trajectory modification features (scene-specific and scene-independent) and their corresponding textual descriptions for the objects in the scene online, based on a template. We use LLMs to semantically match user utterances to the textual descriptions of the features. Based on the matched feature, a trajectory modification function is applied to the initial trajectory, allowing generalization to unseen trajectories and object configurations. Through user studies conducted both in simulation and with a physical robot arm, we demonstrate that trajectories deformed using our method were more accurate and were preferred in about 80% of cases, outperforming the baseline. We also showcase the versatility of our system in a manipulation task and an assistive feeding task.
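The two scene-dependent steps in the abstract, generating features with textual descriptions from a template and applying a modification function to the initial trajectory, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the feature keys, the template wording, and the Gaussian-weighted push used in `deform_distance` are not taken from the paper, and the optimizer that enforces kinodynamic constraints is omitted.

```python
import math


def generate_features(object_names):
    """Map each feature to its textual descriptions for the current scene."""
    # Scene-independent features never mention an object.
    features = {
        ("shift", "left"): ["move to the left", "go left"],
        ("shift", "up"): ["move higher", "go up"],
    }
    # Scene-specific features are instantiated once per object from templates.
    for name in object_names:
        features[("distance", name, +1)] = [
            f"stay further away from the {name}",
            f"keep a bigger distance from the {name}",
        ]
        features[("distance", name, -1)] = [
            f"go closer to the {name}",
            f"move nearer to the {name}",
        ]
    return features


def deform_distance(trajectory, obj_pos, sign, gain=0.2, radius=0.5):
    """Push waypoints away from (sign=+1) or toward (sign=-1) an object.

    The displacement decays with distance to the object; the first and last
    waypoints are left unchanged so the start and goal are preserved.
    """
    out = []
    last = len(trajectory) - 1
    for i, point in enumerate(trajectory):
        diff = [p - o for p, o in zip(point, obj_pos)]
        dist = math.sqrt(sum(d * d for d in diff))
        if i in (0, last) or dist == 0.0:
            out.append(tuple(point))
            continue
        step = gain * math.exp(-((dist / radius) ** 2))
        if sign < 0:
            step = min(step, dist)  # do not overshoot the object
        out.append(tuple(p + sign * step * d / dist for p, d in zip(point, diff)))
    return out


# A straight-line initial trajectory passing close to a hypothetical cup.
trajectory = [(0.0, 0.0), (0.25, 0.0), (0.5, 0.0), (0.75, 0.0), (1.0, 0.0)]
cup = (0.5, -0.1)
features = generate_features(["cup"])
corrected = deform_distance(trajectory, cup, sign=+1)
```

Since the deformation depends only on the matched feature, the object position, and the given waypoints, the same function applies to any initial trajectory or object placement, which is the kind of generalization the modular design is meant to provide.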
