DynaMimicGen: A Data Generation Framework for Robot Learning of Dynamic Tasks

Vincenzo Pomponi, Paolo Franceschi, Stefano Baraldo, Loris Roveda, Oliver Avram, Luca Maria Gambardella, Anna Valente

TL;DR

DynaMimicGen (D-MG) introduces a scalable pipeline that converts one or two human demonstrations into large, diverse, and dynamically adaptable datasets for robotic manipulation by segmenting tasks into object-centric subtasks and encoding each with Dynamic Movement Primitives. It enables real-time adaptation to changing object poses and scene geometry, generating data under dynamic disturbances to improve policy robustness. Across Stack, Square, and MugCleanup tasks, D-MG achieves high data-generation rates and yields downstream policies with superior performance compared to MimicGen, demonstrated in both image-based and low-dimensional observations, including real-world experiments with a Franka Panda. The work highlights the practical impact of dynamic data generation for scalable imitation learning, while acknowledging limitations such as the need for known subtask sequences, reliable perception, and single-arm operation, and outlining future directions in obstacle-aware planning, multi-object dependencies, and sim-to-real transfer.

Abstract

Learning robust manipulation policies typically requires large and diverse datasets, the collection of which is time-consuming, labor-intensive, and often impractical for dynamic environments. In this work, we introduce DynaMimicGen (D-MG), a scalable dataset generation framework that enables policy training from minimal human supervision while uniquely supporting dynamic task settings. Given only a few human demonstrations, D-MG first segments the demonstrations into meaningful sub-tasks, then leverages Dynamic Movement Primitives (DMPs) to adapt and generalize the demonstrated behaviors to novel and dynamically changing environments. Improving on prior methods that rely on static assumptions or simplistic trajectory interpolation, D-MG produces smooth, realistic, and task-consistent Cartesian trajectories that adapt in real time to changes in object poses, robot states, or scene geometry during task execution. Our method supports varied scenarios, including different scene layouts, object instances, and robot configurations, making it suitable for both static and highly dynamic manipulation tasks. We show that robot agents trained via imitation learning on D-MG-generated data achieve strong performance across long-horizon and contact-rich benchmarks, including tasks like cube stacking and placing mugs in drawers, even under unpredictable environment changes. By eliminating the need for extensive human demonstrations and enabling generalization in dynamic settings, D-MG offers a powerful and efficient alternative to manual data collection, paving the way toward scalable, autonomous robot learning.
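
To make the role of DMPs in the abstract more concrete, the following is a minimal one-dimensional sketch of a discrete DMP in the standard spring-damper-plus-forcing-term form. It is not the authors' implementation. The class name `DMP1D`, the helpers `finite_diff` and `min_jerk`, the gains, the number of basis functions, and the synthetic demonstration are all assumptions chosen for illustration. D-MG operates on Cartesian end-effector trajectories, whereas this sketch tracks a single scalar coordinate. The point it shows is that the goal is re-read at every integration step, so the trajectory bends toward a target that moves during execution.

```python
import math


def finite_diff(values, dt):
    """Central finite differences (one-sided at the two ends)."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((values[hi] - values[lo]) / ((hi - lo) * dt))
    return out


class DMP1D:
    """Hypothetical minimal 1-D discrete DMP; all gains are illustrative."""

    def __init__(self, n_basis=20, alpha_z=25.0, alpha_x=4.0):
        self.alpha_z = alpha_z
        self.beta_z = alpha_z / 4.0  # critically damped spring-damper
        self.alpha_x = alpha_x
        # Centers evenly spaced in time, hence exponentially spaced in phase x.
        self.centers = [math.exp(-alpha_x * i / (n_basis - 1)) for i in range(n_basis)]
        self.widths = []
        for i in range(n_basis):
            j = min(i, n_basis - 2)
            gap = self.centers[j] - self.centers[j + 1]
            self.widths.append(1.0 / gap ** 2)
        self.weights = [0.0] * n_basis

    def _basis(self, x):
        return [math.exp(-h * (x - c) ** 2) for h, c in zip(self.widths, self.centers)]

    def fit(self, demo, dt):
        """Fit forcing-term weights to one segment (assumes demo[-1] != demo[0])."""
        n = len(demo)
        self.y0, self.g0 = demo[0], demo[-1]
        self.tau = (n - 1) * dt
        vel = finite_diff(demo, dt)
        acc = finite_diff(vel, dt)
        num = [0.0] * len(self.weights)
        den = [1e-10] * len(self.weights)
        for i in range(n):
            x = math.exp(-self.alpha_x * i * dt / self.tau)
            # Forcing term that would make the spring-damper reproduce the demo.
            f_target = (self.tau ** 2 * acc[i]
                        - self.alpha_z * (self.beta_z * (self.g0 - demo[i]) - self.tau * vel[i]))
            s = x * (self.g0 - self.y0)
            for b, psi in enumerate(self._basis(x)):
                num[b] += psi * s * f_target
                den[b] += psi * s * s
        self.weights = [a / d for a, d in zip(num, den)]

    def rollout(self, y0, goal_fn, dt, n_steps):
        """Integrate with explicit Euler steps; goal_fn(t) is re-read at every step."""
        y, z = y0, 0.0
        traj = [y]
        for k in range(n_steps):
            t = k * dt
            x = math.exp(-self.alpha_x * t / self.tau)
            psi = self._basis(x)
            shape = sum(p * w for p, w in zip(psi, self.weights)) / (sum(psi) + 1e-10)
            f = shape * x * (self.g0 - self.y0)  # amplitude kept from the demo
            g = goal_fn(t)
            z += dt * (self.alpha_z * (self.beta_z * (g - y) - z) + f) / self.tau
            y += dt * z / self.tau
            traj.append(y)
        return traj


def min_jerk(n):
    """Synthetic stand-in for a human demonstration, moving from 0 to 1."""
    out = []
    for i in range(n):
        s = i / (n - 1)
        out.append(10 * s ** 3 - 15 * s ** 4 + 6 * s ** 5)
    return out


dmp = DMP1D()
dmp.fit(min_jerk(101), dt=0.01)
static = dmp.rollout(0.0, lambda t: 1.0, dt=0.01, n_steps=200)
# The goal jumps mid-execution, as if the target object had been moved.
moved = dmp.rollout(0.0, lambda t: 1.0 if t < 0.5 else 1.5, dt=0.01, n_steps=200)
print(round(static[-1], 2), round(moved[-1], 2))
```

Because the forcing term fades as the phase variable decays, the spring-damper part dominates late in the motion and pulls the state toward whatever goal is current. This is the property that allows a single fitted primitive to be reused when the target changes. One simplification to note: the forcing-term amplitude here stays fixed at the demonstrated value rather than being rescaled with the new goal.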

Paper Structure

This paper contains 37 sections, 9 equations, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Overview of the DynaMimicGen system pipeline. (Left) DynaMimicGen begins by selecting the most relevant demonstration (highlighted in green) from the source dataset for the target task. It then identifies the appropriate reference segment (highlighted in red), corresponding to an object-centric subtask, to train a Dynamic Movement Primitive (DMP). (Right) To generate a new demonstration in a novel scene, DynaMimicGen (1) monitors the current state of the environment, (2) transforms the selected segment to generate the DMP goal for the current state configuration, and (3) executes the resulting trajectory. This process is performed with real-time monitoring of the environment to adapt the trajectory in response to dynamic changes.
  • Figure 2: Tasks. We show all of the simulation tasks in the figure above. They span different behaviors including pick-and-place, precise insertion and articulation, and include long-horizon tasks requiring chaining several behaviors together.
  • Figure 3: Initial states sampled from the $D_0$, $D_1$, and $D_2$ distributions for the Square task. Each distribution introduces increasing variability in the positions and orientations of the nut and peg, illustrating how task difficulty and generalization demands grow across the three settings.
  • Figure 4: Dynamic Dataset Generation. Example illustrating how DynaMimicGen (D-MG) introduces controlled perturbations to object positions during task execution. By continuously sensing the environment and adapting trajectories in response to these changes, D-MG generates diverse demonstrations that go beyond the original human demonstration, capturing a wide range of behaviors. This dynamic variation enhances the dataset’s richness, ultimately supporting the training of more robust and generalizable policies for robotic manipulation tasks.
  • Figure 5: Real-World Subtasks. Illustration of the three real-world manipulation tasks evaluated in our experiments: Lift, Stack, and Cleanup. These tasks span different levels of complexity and interaction dynamics, providing a representative benchmark for assessing the robustness and generalization ability of policies trained using DynaMimicGen-generated datasets.
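
Step (2) in the Figure 1 caption, transforming the selected segment into a DMP goal for the current scene, and the mid-execution perturbations in Figure 4 can be illustrated with a small planar sketch. It assumes the goal is retargeted by preserving the end-effector pose relative to the object, which is the usual object-centric convention in MimicGen-style pipelines. The exact transform used by D-MG is not given in this summary. Poses are simplified to planar `(x, y, theta)` tuples, and the function names `compose`, `inverse`, and `retarget_goal` are hypothetical.

```python
import math


def compose(a, b):
    """Return the planar pose a * b, with poses given as (x, y, theta)."""
    ax, ay, at = a
    bx, by, bt = b
    c, s = math.cos(at), math.sin(at)
    return (ax + c * bx - s * by, ay + s * bx + c * by, at + bt)


def inverse(p):
    """Return the inverse of a planar pose: (R^T, -R^T t)."""
    x, y, t = p
    c, s = math.cos(t), math.sin(t)
    return (-(c * x + s * y), s * x - c * y, -t)


def retarget_goal(ee_goal_src, obj_src, obj_now):
    """Keep the end-effector goal fixed relative to the object.

    ee_goal_src: end-effector goal recorded in the source demonstration (world frame)
    obj_src:     object pose in the source demonstration (world frame)
    obj_now:     object pose currently observed in the new scene (world frame)
    """
    goal_in_object_frame = compose(inverse(obj_src), ee_goal_src)
    return compose(obj_now, goal_in_object_frame)


# Source demonstration: the gripper goal sits 0.1 m to the object's left.
obj_src = (0.5, 0.0, 0.0)
ee_goal_src = (0.5, 0.1, 0.0)

# Simulated observations: the object is pushed and rotated while the task runs.
observed = [(0.5, 0.0, 0.0), (0.35, 0.15, math.pi / 4), (0.2, 0.3, math.pi / 2)]

# Recomputing the goal at every observation keeps it attached to the object.
goals = [retarget_goal(ee_goal_src, obj_src, obj_now) for obj_now in observed]
for g in goals:
    print(tuple(round(v, 3) for v in g))
```

In a full pipeline each recomputed goal would be handed to the running motion primitive rather than printed, so the trajectory is corrected as soon as a new object pose is sensed.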