Vision-Based Dexterous Motion Planning by Dynamic Movement Primitives with Human Hand Demonstration
Nuo Chen, Ya-Jun Pan
TL;DR
This work presents a vision-based framework that lets a robot learn dexterous manipulation from human hand demonstrations. A depth camera combined with MediaPipe reconstructs the 3D hand position and orientation, together with the grasp state. The recorded data are smoothed with a mean filter and then encoded by a Modified Dynamic Movement Primitive (DMP), which generalizes the demonstrated trajectory to new start and end points. The robot tracks the generated trajectory in Cartesian space under impedance control. The system achieves a maximum hand-tracking error of at most $2\text{ cm}$ and performs a dexterous pick-and-place task that moves around obstacles and places the object into a sloped container. By combining full hand posture information with a decoupled DMP, the approach enables flexible, vision-driven manipulation on a 7-DOF platform, with relevance to collaborative robotics.
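To make the smoothing and generalization steps concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It uses the classic one-dimensional discrete DMP formulation rather than the paper's modified variant, whose details are not given in this section. The names `mean_filter` and `DMP1D`, the gains, the window size, the number of basis functions, and the synthetic demonstration are all hypothetical choices made for illustration.

```python
import numpy as np


def mean_filter(signal, window=5):
    """Moving-average smoothing; `window` (odd) is an assumed value."""
    pad = window // 2
    padded = np.pad(signal, pad, mode="edge")  # repeat edge samples to keep the length
    return np.convolve(padded, np.ones(window) / window, mode="valid")


class DMP1D:
    """Classic one-dimensional discrete DMP (not the paper's modified variant)."""

    def __init__(self, n_basis=30, alpha_z=25.0, alpha_x=4.0):
        self.alpha_z = alpha_z
        self.beta_z = alpha_z / 4.0  # critically damped spring-damper
        self.alpha_x = alpha_x
        # Centers evenly spaced in time, expressed in phase x = exp(-alpha_x * t / tau).
        self.centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
        gaps = np.diff(self.centers)
        self.widths = 1.0 / np.append(gaps, gaps[-1]) ** 2
        self.weights = np.zeros(n_basis)
        self.tau = 1.0

    def _forcing(self, x, scale):
        # Normalized weighted sum of Gaussian bases, gated by phase and goal offset.
        psi = np.exp(-self.widths * (x - self.centers) ** 2)
        return psi @ self.weights / (psi.sum() + 1e-10) * x * scale

    def fit(self, y, dt):
        """Learn forcing-term weights from one demonstration by locally weighted regression."""
        y = np.asarray(y, dtype=float)
        self.tau = (len(y) - 1) * dt
        y0, g = y[0], y[-1]
        yd = np.gradient(y, dt)
        ydd = np.gradient(yd, dt)
        x = np.exp(-self.alpha_x * np.arange(len(y)) * dt / self.tau)
        # Forcing term that would make the spring-damper reproduce the demonstration.
        f_target = self.tau**2 * ydd - self.alpha_z * (self.beta_z * (g - y) - self.tau * yd)
        s = x * (g - y0)
        psi = np.exp(-self.widths * (x[:, None] - self.centers) ** 2)
        num = (psi * (s * f_target)[:, None]).sum(axis=0)
        den = (psi * (s**2)[:, None]).sum(axis=0) + 1e-10
        self.weights = num / den

    def rollout(self, y0, g, dt, duration_factor=1.5):
        """Integrate the DMP with explicit Euler steps toward a new start and goal."""
        n_steps = int(round(duration_factor * self.tau / dt)) + 1
        y, z, x = float(y0), 0.0, 1.0
        path = [y]
        for _ in range(n_steps - 1):
            f = self._forcing(x, g - y0)
            zd = (self.alpha_z * (self.beta_z * (g - y) - z) + f) / self.tau
            yd = z / self.tau
            x += -self.alpha_x * x / self.tau * dt
            z += zd * dt
            y += yd * dt
            path.append(y)
        return np.array(path)


# Hypothetical demonstration: a noisy minimum-jerk reach from 0.0 m to 0.3 m over 1 s.
dt = 0.01
t = np.linspace(0.0, 1.0, 101)
rng = np.random.default_rng(0)
demo = 0.3 * (10 * t**3 - 15 * t**4 + 6 * t**5) + rng.normal(0.0, 0.001, t.size)

dmp = DMP1D()
dmp.fit(mean_filter(demo, window=9), dt)
new_path = dmp.rollout(y0=0.1, g=0.6, dt=dt)  # same motion shape, new start and goal
```

For a 3D wrist path, a common arrangement is one such DMP per Cartesian axis driven by a shared phase variable. Orientation and grasp state need their own treatment, which this sketch does not cover.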
Abstract
This paper proposes a vision-based framework that enables a 7-degree-of-freedom robotic manipulator to learn dexterous pick-and-place tasks from human hand demonstrations. Most existing works consider only the demonstrated position and ignore orientation. Here, MediaPipe is applied to the images from a single depth camera to generate the three-dimensional coordinates of the human hand, recording the full hand motion: the wrist trajectory, the hand orientation, and the grasp motion. A mean filter smooths the raw data during pre-processing. In the demonstration, an object is picked up at a specific angle, moved around obstacles in its path, and then placed in a sloped container. Using Dynamic Movement Primitives, the robot learns the user's actions and reproduces them as trajectories with different start and end poi
