Conditional Variational Auto Encoder Based Dynamic Motion for Multi-task Imitation Learning

Binzhao Xu, Muhayy Ud Din, Irfan Hussain

TL;DR

This work presents a CVAE-enabled Dynamic Motion Primitive (DMP) framework for multi-task imitation learning. By embedding a second-order dynamic system within the CVAE encoder/decoder, the method conditions torque generation on task IDs and via-points, enabling trajectories that adapt to new goals while preserving convergence properties. A data-augmentation–driven training regime and a via-point finetuning mechanism yield accurate endpoints and constraint satisfaction with few demonstrations. Empirical results on handwriting-like trajectories and simulated robotic tasks demonstrate high accuracy, rapid adaptation, and successful transfer to reaching and pushing tasks, highlighting the practical impact for flexible, data-efficient multi-task imitation. The approach offers a pathway to scalable, task-conditioned motion generation with constraints, though automatic selection of trajectory shapes for complex robotic tasks remains an area for future work.

Abstract

The dynamic motion primitive (DMP)-based method is an effective approach to learning from demonstrations. However, most current DMP-based methods focus on learning one task with one module. Some deep learning-based frameworks can learn multiple tasks at the same time, but they require a large amount of training data and generalize poorly to untrained states. In this paper, we propose a framework that combines the advantages of the traditional DMP-based method and the conditional variational auto-encoder (CVAE). The encoder and decoder are each made of a dynamic system and a deep neural network. The deep neural networks generate torque conditioned on the task ID. This torque is then used to create the desired trajectory in the dynamic system based on the final state. In this way, the generated trajectory can adjust to a new goal position. We also propose a fine-tuning method to guarantee the via-point constraint. Our model is trained on a handwritten number dataset and can be used directly to solve robotic tasks: reaching and pushing. The proposed model is validated in a simulation environment. The results show that after training on the handwritten number dataset, it achieves a 100% success rate on pushing and reaching tasks.
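
The role of the dynamic system described in the abstract can be illustrated with a minimal one-dimensional sketch. It assumes the standard DMP form of a critically damped second-order system driven by a forcing term; the abstract does not give the paper's exact equations, so the gains, the phase variable, and the `TASK_FORCING` table (a hand-written stand-in for the neural network's task-conditioned torque) are all hypothetical.

```python
import math

# Hypothetical stand-in for the network output: one forcing profile per task ID.
# In the paper a deep network produces this term; fixed functions are used here
# only to show how the dynamic system consumes it.
TASK_FORCING = {
    0: lambda x: 0.0,                                   # no shaping: straight approach
    1: lambda x: 200.0 * math.sin(2.0 * math.pi * x),   # shaped trajectory
}


def rollout(y0, goal, task_id, alpha=25.0, beta=6.25, alpha_x=4.0,
            tau=1.0, dt=0.001, steps=3000):
    """Integrate tau*dv = alpha*(beta*(goal - y) - v) + x*f(x), tau*dy = v.

    The phase x decays from 1 to 0, so the forcing term vanishes over time
    and the spring-damper part pulls the state to the goal.
    All gains are assumed values (beta = alpha / 4 gives critical damping).
    """
    forcing = TASK_FORCING[task_id]
    y, v, x = y0, 0.0, 1.0
    path = [y]
    for _ in range(steps):
        f = x * forcing(x)                       # forcing gated by the phase
        dv = (alpha * (beta * (goal - y) - v) + f) / tau
        v += dv * dt                             # semi-implicit Euler step
        y += (v / tau) * dt
        x += (-alpha_x * x / tau) * dt           # phase decay
        path.append(y)
    return path


if __name__ == "__main__":
    for goal in (1.0, -0.5):
        for task_id in TASK_FORCING:
            end = rollout(0.0, goal, task_id)[-1]
            print(f"task {task_id}, goal {goal}: end error {abs(end - goal):.2e}")
```

Because the forcing term is gated by the decaying phase, changing `goal` moves the endpoint while the task-specific shape is retained, which is the property the abstract refers to when it says the generated trajectory can adjust to a new goal position.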

Paper Structure (21 sections, 9 equations, 11 figures, 1 table)

Figures (11)

  • Figure 1: Multi-task imitation learning. Different tasks have different trajectory shapes. The trajectory for the reaching task looks like a line, while the trajectory for the pushing task is L-shaped.
  • Figure 2: The main process of our method. Our method contains two stages -- the training stage and the generation stage. At the training stage, a CVAE structure is optimized and conditioned on task ID and via points. The encoder and decoder are made up of a second-order dynamic system and a deep neural network. At the generation stage, the new trajectory is generated by sampling from latent space and task parameters.
  • Figure 3: Data augmentation. The blue line represents the original trajectory, and the colored lines are the trajectories generated using data augmentation. The right-side figures show the forces of these trajectories.
  • Figure 4: Trajectory generation for different tasks and endpoints. (a) All trajectories begin at $[0, 1]$ and end at $[1, 0]$. (b) Trajectories end at different points.
  • Figure 5: Fine-tuning for the via-point constraint. (a) The initial trajectory. (b) The trajectory with the via-point constraint.
  • ...and 6 more figures