Movement Primitive Diffusion: Learning Gentle Robotic Manipulation of Deformable Objects

Paul Maria Scheikl, Nicolas Schreiber, Christoph Haas, Niklas Freymuth, Gerhard Neumann, Rudolf Lioutikov, Franziska Mathis-Ullrich

TL;DR

This paper tackles the challenge of data-efficient, gentle manipulation of deformable objects in robot-assisted surgery by introducing Movement Primitive Diffusion (MPD). MPD fuses score-based diffusion over action sequences with Probabilistic Dynamic Movement Primitives to produce multimodal, high-frequency trajectories that respect boundary conditions. Empirical results on four LapGym tasks across simulation and real hardware show that MPD achieves higher success rates and superior motion quality with fewer demonstrations than state-of-the-art baselines, including diffusion-based policies and BESO. The approach holds promise for practical surgical autonomy by delivering safe, efficient, and adaptable manipulation of delicate tissues in visually rich settings.

Abstract

Policy learning in robot-assisted surgery (RAS) lacks data-efficient and versatile methods that exhibit the desired motion quality for delicate surgical interventions. To this end, we introduce Movement Primitive Diffusion (MPD), a novel method for imitation learning (IL) in RAS that focuses on gentle manipulation of deformable objects. The approach combines the versatility of diffusion-based imitation learning (DIL) with the high-quality motion generation capabilities of Probabilistic Dynamic Movement Primitives (ProDMPs). This combination enables MPD to achieve gentle manipulation of deformable objects while maintaining the data efficiency critical for RAS applications, where demonstration data is scarce. We evaluate MPD across various simulated and real-world robotic tasks on both state and image observations. MPD outperforms state-of-the-art DIL methods in success rate, motion quality, and data efficiency. Project page: https://scheiklp.github.io/movement-primitive-diffusion/

Paper Structure (14 sections, 8 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: Schematic for action sequence generation with MPD for bimanual tissue manipulation. Observations $o$ and initial values $s_0$ for position and velocity are captured on the bimanual robotic setup. An ODE solver solves the Probability Flow ODE with learnable model $E_\Theta$ and ProDMP $P$ by iteratively denoising an action sequence $\tilde{\tau}^k$ for diffusion step $k$ and respective noise level $t$. The final denoised action sequence $\tau^0$ is executed on the robots.
  • Figure 2: Start, intermediate, and end state of the tasks in simulation. The final column shows the respective real-world experiment. Grasp Lift Touch (GLT) requires sequential collaboration between instruments, Rope Threading (RT) and Ligating Loop (LL) depend on accurate alignment of deformable ropes, and Bimanual Tissue Manipulation (BTM) requires concurrent collaboration between instruments to control the shape of a deformable tissue.
  • Figure 3: Example images from the real-world task. The deformation behavior of the tissue differs significantly from the simulated variant. When stretched, it forms folds, and when slack and without tension it varies widely, e.g., bulging forward or folding in.
  • Figure 4: Cartesian instrument positions of trajectories generated in reference to human demonstrations on the simulated Bimanual Tissue Manipulation task. In contrast to the baselines, MPD consistently generates smooth trajectories.
  • Figure 5: Success rate of state- and image-based policies on the real-world task in relation to the distance threshold $T$ between marker and target.
  • ...and 2 more figures
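
The denoising loop described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a Karras-style parameterization of the Probability Flow ODE, $\mathrm{d}\tau/\mathrm{d}t = (\tau - D(\tau, t))/t$, integrated with plain Euler steps, which the caption does not specify. The function `toy_denoiser` is a hypothetical stand-in for the combination of the learned model $E_\Theta$ and the ProDMP $P$; it simply returns a smooth trajectory from the initial position to a goal read from the observation. For brevity, $s_0$ holds only a position here, whereas the caption mentions position and velocity. All names and default values are assumptions made for illustration.

```python
import numpy as np


def toy_denoiser(noisy_actions, observation, s0, sigma):
    """Hypothetical stand-in for P(E_Theta(noisy_actions, observation, sigma), s0).

    A real model would predict movement-primitive parameters from its inputs;
    this placeholder ignores the noisy input and the noise level and returns a
    smooth trajectory that starts exactly at s0 and ends at the goal.
    """
    horizon = noisy_actions.shape[0]
    phase = np.linspace(0.0, 1.0, horizon)[:, None]
    blend = 3.0 * phase**2 - 2.0 * phase**3  # smoothstep: 0 at start, 1 at end
    return s0 + blend * (observation - s0)


def sample_action_sequence(observation, s0, horizon=16, num_steps=10,
                           sigma_max=1.0, seed=0):
    """Euler integration of the probability flow ODE from sigma_max down to 0."""
    rng = np.random.default_rng(seed)
    # Noise levels, decreasing linearly to zero (an assumed schedule).
    sigmas = np.linspace(sigma_max, 0.0, num_steps + 1)
    # Start from pure noise at the highest noise level.
    actions = sigma_max * rng.standard_normal((horizon, s0.shape[0]))
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = toy_denoiser(actions, observation, s0, sigma)
        slope = (actions - denoised) / sigma  # d(actions)/d(sigma)
        actions = actions + (sigma_next - sigma) * slope
    return actions


goal = np.array([1.0, 2.0])   # toy "observation": a 2D goal position
start = np.array([0.0, 0.0])  # toy initial value s0 (position only)
trajectory = sample_action_sequence(goal, start)
```

Because the denoiser output always starts at `start`, the final sequence does too, which mirrors in a simplified way how decoding through a movement primitive lets the generated trajectory respect its initial conditions.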