Conditional Variational Auto Encoder Based Dynamic Motion for Multi-task Imitation Learning
Binzhao Xu, Muhayy Ud Din, Irfan Hussain
TL;DR
This work presents a CVAE-enabled Dynamic Motion Primitive (DMP) framework for multi-task imitation learning. By embedding a second-order dynamic system within the CVAE encoder/decoder, the method conditions torque generation on task IDs and via-points, enabling trajectories that adapt to new goals while preserving convergence properties. A data-augmentation–driven training regime and a via-point finetuning mechanism yield accurate endpoints and constraint satisfaction with few demonstrations. Empirical results on handwriting-like trajectories and simulated robotic tasks demonstrate high accuracy, rapid adaptation, and successful transfer to reaching and pushing tasks, highlighting the practical impact for flexible, data-efficient multi-task imitation. The approach offers a pathway to scalable, task-conditioned motion generation with constraints, though automatic selection of trajectory shapes for complex robotic tasks remains an area for future work.
Abstract
Dynamic motion primitives (DMPs) are an effective approach to learning from demonstrations. However, most current DMP-based methods learn a single task with a single module. Some deep learning-based frameworks can learn multiple tasks simultaneously, but they require large amounts of training data and generalize poorly to states not seen during training. In this paper, we propose a framework that combines the advantages of traditional DMP-based methods and the conditional variational auto-encoder (CVAE). The encoder and decoder are built from a dynamic system and deep neural networks. The deep neural networks generate a torque conditioned on the task ID. This torque is then used to create the desired trajectory in the dynamic system based on the final state, so the generated trajectory can adjust to a new goal position. We also propose a fine-tuning method to guarantee the via-point constraint. Our model is trained on a handwritten number dataset and can be applied directly to robotic tasks, namely reaching and pushing. The proposed model is validated in a simulation environment. The results show that, after training on the handwritten number dataset, it achieves a 100% success rate on pushing and reaching tasks.
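The goal-adaptation idea in the abstract can be illustrated with a minimal sketch of a standard discrete DMP rollout. This is not the authors' implementation: the abstract does not give the exact dynamic system, so the second-order form, the gains, the phase variable, and the names `toy_forcing` and `rollout_dmp` are all assumptions. The "torque" is treated here as the DMP forcing term, and a hypothetical sinusoid indexed by task ID stands in for the output of the CVAE decoder network.

```python
import numpy as np


def toy_forcing(task_id, steps, amplitude=50.0):
    """Hypothetical stand-in for the network output: one 2-D forcing
    profile per task ID. A trained decoder would produce this instead."""
    t = np.linspace(0.0, 1.0, steps)
    freq = 2.0 * np.pi * (task_id + 1)
    return amplitude * np.stack([np.sin(freq * t), np.cos(freq * t)], axis=1)


def rollout_dmp(forcing, y0, goal, tau=1.0, alpha=25.0, beta=6.25,
                alpha_x=3.0, dt=0.01):
    """Integrate an assumed second-order DMP with explicit Euler steps.

    forcing: array of shape (T, D), one forcing vector per time step.
    Returns the position trajectory with shape (T + 1, D).
    """
    goal = np.asarray(goal, dtype=float)
    y = np.asarray(y0, dtype=float).copy()  # position
    z = np.zeros_like(y)                    # scaled velocity
    x = 1.0                                 # phase, decays from 1 toward 0
    path = [y.copy()]
    for f in forcing:
        # Spring-damper pulls y toward the goal; the forcing term shapes
        # the path and is gated by the phase so that it fades over time.
        dz = (alpha * (beta * (goal - y) - z) + x * f) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z = z + dz * dt
        y = y + dy * dt
        x = x + dx * dt
        path.append(y.copy())
    return np.stack(path)


# The same forcing profile (same task ID) rolled out toward two goals.
forcing = toy_forcing(task_id=3, steps=400)
path_a = rollout_dmp(forcing, y0=[0.0, 0.0], goal=[1.0, 0.5])
path_b = rollout_dmp(forcing, y0=[0.0, 0.0], goal=[-0.3, 2.0])
```

Because the forcing term is gated by a decaying phase, the spring-damper part dominates late in the rollout and both trajectories settle at their respective goals while sharing the shape imposed by the forcing profile. This is the property that lets a learned, task-conditioned forcing term be reused for new goal positions.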
