Conditional Neural Expert Processes for Learning Movement Primitives from Demonstration

Yigit Yildirim, Emre Ugur

TL;DR

This work proposes Conditional Neural Expert Processes (CNEP), a Learning from Demonstration (LfD) framework that learns to assign demonstrations from different modes to distinct expert networks, using the information in the latent space to match experts with the encoded representations.

Abstract

Learning from Demonstration (LfD) is a widely used technique for skill acquisition in robotics. However, demonstrations of the same skill may exhibit significant variance, or learning systems may attempt to acquire different means of the same skill simultaneously, making it challenging to encode these motions into movement primitives. To address these challenges, we propose an LfD framework, Conditional Neural Expert Processes (CNEP), that learns to assign demonstrations from different modes to distinct expert networks, using the information in the latent space to match experts with the encoded representations. CNEP does not require supervision on which mode each trajectory belongs to. We compare CNEP against widely used and powerful LfD methods such as Gaussian Mixture Models, Probabilistic Movement Primitives, and Stable Movement Primitives, and show that our method outperforms these baselines on multimodal trajectory datasets. The results reveal enhanced modeling performance for movement primitives, leading to synthesized trajectories that more accurately reflect those demonstrated by experts, particularly when trajectories from different demonstrations intersect. We evaluate CNEP on two real-robot tasks, obstacle avoidance and pick-and-place, that require the robot to learn multimodal motion trajectories and execute the correct primitives given target environment conditions. We also show that our system is capable of on-the-fly adaptation to environmental changes via an online conditioning mechanism. Lastly, we believe that CNEP offers improved explainability and interpretability by autonomously finding discrete behavior primitives and providing probability values for its expert selection decisions.

Paper Structure (23 sections, 7 equations, 10 figures, 3 tables)

Figures (10)

  • Figure 1: The CNEP model contains an Encoder, a Gate, and multiple Query Networks called experts. In this example, $n=3$ observation points (shown with $\bullet$) and $m=1$ target timepoint (shown with $\bullet$) are randomly sampled on the input trajectory. The latent representation, the mean of the $n$ observation encodings, is used by (1) the Gate Network to find the responsible expert and (2) the expert networks to generate their predictions as normal distributions at the target timepoint. Refer to Section \ref{sec:cnep_arch} for details.
  • Figure 2: Initially, observation points from the input trajectory are mapped to the latent space by the Encoder Network. The averaged representations ($\mathbf{r}$) are (1) fed into the Gate Network and (2) concatenated with target points ($\mathbf{r_q}$) and fed into the Query Networks. All Query Networks output candidate predictions, while the Gate Network generates the probabilities for each trajectory-expert pair ($\mathbf{p}$). These probabilities are used to calculate the loss components, as explained in Section \ref{sec:cnep_train}.
  • Figure 3: (a) Sensorimotor (SM) trajectories and observation points used in the comparison. (b) Predictions conditioned on different points. While CNMP might produce an average response, CNEP successfully generates the target trajectory.
  • Figure 4: The left column presents datasets of sensorimotor trajectories of increasing complexity. The right column presents the corresponding trajectories synthesized in an example run after training. Modeling the skills these trajectories realize becomes more challenging as the number of modalities increases. However, different experts inside the CNEP model can successfully handle the increasing complexity.
  • Figure 5: An expert performs demonstrations of the obstacle avoidance skill. Kinesthetic teaching is used to generate sensorimotor demonstrations, which are later used to train the CNEP and CNMP models.
  • ...and 5 more figures
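
The data flow described in the captions of Figures 1 and 2 can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the layer sizes, the random (untrained) weights, the softmax in the gate, the softplus used to keep standard deviations positive, and all names such as `cnep_forward` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 1-D time input, 1-D sensorimotor output, 2 experts.
D_T, D_Y, D_R, N_EXPERTS = 1, 1, 16, 2

def mlp_params(sizes):
    """Random weights standing in for trained parameters (assumption)."""
    return [(rng.normal(0.0, 0.5, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """ReLU MLP with a linear final layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

encoder = mlp_params([D_T + D_Y, 32, D_R])
gate = mlp_params([D_R, N_EXPERTS])
experts = [mlp_params([D_R + D_T, 32, 2 * D_Y]) for _ in range(N_EXPERTS)]

def cnep_forward(obs_t, obs_y, target_t):
    """Return gate probabilities and per-expert (mean, std) at the targets."""
    obs = np.concatenate([obs_t, obs_y], axis=-1)      # (n, D_T + D_Y)
    r = mlp(encoder, obs).mean(axis=0)                 # mean of n encodings
    p = softmax(mlp(gate, r))                          # one probability per expert
    m = target_t.shape[0]
    r_q = np.concatenate([np.tile(r, (m, 1)), target_t], axis=-1)
    means, stds = [], []
    for params in experts:
        out = mlp(params, r_q)                         # (m, 2 * D_Y)
        means.append(out[:, :D_Y])
        stds.append(np.log1p(np.exp(out[:, D_Y:])) + 1e-6)  # softplus > 0
    return p, np.stack(means), np.stack(stds)

# n = 3 observation points and m = 1 target timepoint, as in Figure 1.
obs_t = np.array([[0.1], [0.5], [0.9]])
obs_y = np.array([[0.0], [0.4], [0.2]])
target_t = np.array([[0.7]])

p, means, stds = cnep_forward(obs_t, obs_y, target_t)
best = int(np.argmax(p))  # expert the gate considers responsible
print("gate probabilities:", p)
print("selected expert:", best, "mean:", means[best], "std:", stds[best])
```

Here every expert produces a candidate normal distribution, and the gate's probabilities indicate which expert to trust for the given observations. Training, including the loss components mentioned in the Figure 2 caption, is deliberately left out of this sketch.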