HAND Me the Data: Fast Robot Adaptation via Hand Path Retrieval
Matthew Hong, Anthony Liang, Kevin Kim, Harshitha Rajaprakash, Jesse Thomason, Erdem Bıyık, Jesse Zhang
TL;DR
This work tackles rapid robot adaptation to new tasks using only a single human hand demonstration and a task-agnostic robot play dataset. It introduces HAND, a two-stage retrieval framework that first filters robot sub-trajectories by visual similarity and then retrieves those whose motion matches the 2D hand path, followed by parameter-efficient policy fine-tuning with LoRA adapters. The approach learns new tasks in under four minutes on real robots and outperforms retrieval baselines by over 2x in average task success rate, even with hand demonstrations from unseen scenes and camera angles. The findings highlight the practicality of hand-path retrieval for scalable, data-efficient robot learning in human-centric settings, with implications for non-expert users and rapid task deployment.
Abstract
We hand the community HAND, a simple and time-efficient method for teaching robots new manipulation tasks through human hand demonstrations. Instead of relying on task-specific robot demonstrations collected via teleoperation, HAND uses easy-to-provide hand demonstrations to retrieve relevant behaviors from task-agnostic robot play data. Using a visual tracking pipeline, HAND extracts the motion of the human hand from the hand demonstration and retrieves robot sub-trajectories in two stages: first filtering by visual similarity, then retrieving trajectories with similar behaviors to the hand. Fine-tuning a policy on the retrieved data enables real-time learning of tasks in under four minutes, without requiring calibrated cameras or detailed hand pose estimation. Experiments also show that HAND outperforms retrieval baselines by over 2x in average task success rates on real robots. Videos can be found at our project website: https://liralab.usc.edu/handretrieval/.
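To make the two-stage retrieval described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes each robot sub-trajectory comes with a precomputed visual embedding and a 2D path, uses cosine similarity for the visual filter, and uses dynamic time warping (DTW) on start-relative paths as the behavior-matching step. The function names, the `keep_visual` and `keep_final` parameters, and the choice of DTW are all illustrative assumptions.

```python
import numpy as np


def cosine_similarity(query, candidates):
    """Cosine similarity between one query vector and each candidate row."""
    query = query / np.linalg.norm(query)
    candidates = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return candidates @ query


def relative_path(path):
    """Shift a 2D path so it starts at the origin (assumed normalization)."""
    path = np.asarray(path, dtype=float)
    return path - path[0]


def dtw_distance(a, b):
    """Plain dynamic time warping cost between two 2D paths."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = step + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1]
            )
    return cost[n, m]


def two_stage_retrieve(hand_embedding, hand_path, robot_embeddings,
                       robot_paths, keep_visual=3, keep_final=1):
    """Return indices of robot sub-trajectories retrieved for a hand demo.

    Stage 1 keeps the `keep_visual` most visually similar candidates.
    Stage 2 ranks the survivors by path distance to the hand path.
    """
    sims = cosine_similarity(
        np.asarray(hand_embedding, dtype=float),
        np.asarray(robot_embeddings, dtype=float),
    )
    visual_ids = np.argsort(-sims)[:keep_visual]

    query = relative_path(hand_path)
    dists = [dtw_distance(query, relative_path(robot_paths[i]))
             for i in visual_ids]
    order = np.argsort(dists)[:keep_final]
    return [int(visual_ids[k]) for k in order]
```

In this sketch, a candidate whose path matches the hand motion perfectly is still discarded if it fails the visual filter, which mirrors the ordering in the abstract: visual similarity first, behavior similarity second. The retrieved indices would then select the data used for policy fine-tuning.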
