An Alignment-Based Approach to Learning Motions from Demonstrations
Alex Cuellar, Christopher K Fourie, Julie A Shah
TL;DR
This work addresses limitations of traditional LfD methods when learning from few demonstrations by introducing CALM, an alignment-based framework that uses a mean-trajectory representation to bridge time-independent and time-dependent approaches. CALM computes the robot's velocity from the gradient of a mixture-of-Gaussians alignment score, updates the alignment with an HMM-based forward model, and uses TRACER clustering of the demonstrations to enable multi-modal behavior. The key contributions are an alignment-dependent controller with stability guarantees, an HMM alignment mechanism that is robust to perturbations, and a cluster-selection scheme that follows the most appropriate mean trajectory. All three are validated on 2D datasets and on three 7-DoF robot tasks. Empirically, CALM improves overlap handling, perturbation recovery, and multi-cluster switching, offering a practical, provably stable alternative for learning motions from demonstrations.
Abstract
Learning from Demonstration (LfD) has been shown to provide robots with fundamental motion skills in a variety of domains. The various branches of LfD research (e.g., learned dynamical systems and movement primitives) can generally be classified as "time-dependent" or "time-independent" systems. Each class has fundamental benefits and drawbacks: time-independent methods cannot learn overlapping trajectories, while time-dependent methods can behave undesirably under perturbation. This paper introduces Cluster Alignment for Learned Motions (CALM), an LfD framework that depends on an alignment with a representative "mean" trajectory of the demonstrated motions rather than on pure time- or state-dependence. We discuss the convergence properties of CALM, introduce an alignment technique that handles the shifts in alignment that can occur under perturbation, and use demonstration clustering to generate multi-modal behavior. We show on 2D datasets how CALM mitigates the drawbacks of time-dependent and time-independent techniques, and we implement our system on a 7-DoF robot that learns tasks in three domains.
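To make the alignment idea concrete, the following is a minimal illustrative sketch, not the paper's formulation. It assumes a mean trajectory discretized into points `means`, a belief `alpha` over which point the robot is currently aligned with, isotropic Gaussians of width `sigma`, and a left-to-right transition model with a hypothetical advance probability `p_advance`. All function names and parameters are invented for illustration.

```python
import numpy as np


def gaussian_weights(x, means, sigma):
    """Unnormalized isotropic Gaussian response of state x to each mean point."""
    sq_dist = np.sum((means - x) ** 2, axis=1)
    return np.exp(-0.5 * sq_dist / sigma**2)


def forward_update(alpha, x, means, sigma, p_advance=0.5):
    """One HMM forward step over alignment indices.

    Assumed transition model: stay at the current index with probability
    1 - p_advance, move to the next index with probability p_advance
    (the final index absorbs). The emission is the Gaussian response of x.
    """
    predicted = (1.0 - p_advance) * alpha
    predicted[1:] += p_advance * alpha[:-1]
    predicted[-1] += p_advance * alpha[-1]
    posterior = predicted * gaussian_weights(x, means, sigma)
    total = posterior.sum()
    # Fall back to the prediction if x is too far away for any emission to register.
    return posterior / total if total > 0 else predicted


def alignment_score(alpha, x, means, sigma):
    """Mixture-of-Gaussians score S(x) = sum_k alpha_k exp(-|x - mu_k|^2 / (2 sigma^2))."""
    return float(np.dot(alpha, gaussian_weights(x, means, sigma)))


def score_gradient(alpha, x, means, sigma):
    """Gradient of alignment_score with respect to x; usable as a velocity direction."""
    w = alpha * gaussian_weights(x, means, sigma)
    return (w[:, None] * (means - x)).sum(axis=0) / sigma**2


# Usage: a two-point mean trajectory, with the robot displaced above its start.
means = np.array([[0.0, 0.0], [1.0, 0.0]])
alpha = np.array([1.0, 0.0])
x = np.array([0.0, 1.0])
velocity = score_gradient(alpha, x, means, sigma=1.0)  # points back toward means[0]
alpha_next = forward_update(alpha, x, means, sigma=1.0)
```

In this sketch the velocity command depends on the alignment belief rather than on wall-clock time or on state alone, which is the distinction the abstract draws. The stability analysis and cluster selection described above are not modeled here.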
