Why Look at It at All?: Vision-Free Multifingered Blind Grasping Using Uniaxial Fingertip Force Sensing
Edgar Lee, Junho Choi, Taemin Kim, Changjoo Nam, Seokhwan Jeong
TL;DR
This work tackles vision-free multifingered grasping by relying solely on uniaxial fingertip force feedback and proprioception. It introduces a teacher–student learning pipeline where a privileged-observation RL teacher is trained in simulation and then distilled into a transformer-based student that operates under partial observations during deployment. Real-world experiments across 18 objects achieve $98.3\%$ overall success (ID $100.0\%$, OOD $97.5\%$), demonstrating strong generalization and robustness while reducing sensing complexity. The combination of privileged-teacher distillation, data curation, and a lightweight, vision-free policy offers a scalable path toward low-cost, reliable dexterous manipulation in industrial settings.
Abstract
Grasping under limited sensing remains a fundamental challenge for real-world robotic manipulation, as vision and high-resolution tactile sensors often introduce cost, fragility, and integration complexity. This work demonstrates that reliable multifingered grasping can be achieved under extremely minimal sensing by relying solely on uniaxial fingertip force feedback and joint proprioception, without vision or multi-axis/tactile sensing. To enable such blind grasping, we employ an efficient teacher-student training pipeline in which a reinforcement-learned teacher exploits privileged simulation-only observations to generate demonstrations for distilling a transformer-based student policy operating under partial observation. The student policy is trained to act using only sensing modalities available at real-world deployment. We validate the proposed approach on real hardware across 18 objects, including both in-distribution and out-of-distribution cases, achieving a $98.3\%$ overall grasp success rate. These results demonstrate strong robustness and generalization beyond the simulation training distribution, while significantly reducing sensing requirements for real-world grasping systems.
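To make the data flow of the teacher-student pipeline concrete, the following is a minimal sketch of how teacher rollouts might be turned into a distillation dataset for the student. It is an illustration under stated assumptions, not the authors' implementation: the names `Step`, `Episode`, `student_obs`, `curate`, `make_windows`, and `build_dataset` are hypothetical, the privileged signal is assumed to be an object pose, curation is assumed to mean keeping only successful rollouts, and the history length and left-padding scheme are arbitrary choices. The transformer itself is omitted; the sketch only shows that the student's inputs exclude privileged observations while its targets are the teacher's actions.

```python
from dataclasses import dataclass
from typing import List, Tuple

Observation = List[float]
Window = List[Observation]


@dataclass
class Step:
    joint_pos: List[float]    # proprioception, available on real hardware
    tip_force: List[float]    # one scalar (uniaxial) force per fingertip
    object_pose: List[float]  # ASSUMED privileged signal, simulation only
    action: List[float]       # teacher action, used as the student's target


@dataclass
class Episode:
    steps: List[Step]
    success: bool


def student_obs(step: Step) -> Observation:
    """Partial observation: proprioception plus fingertip forces only."""
    return step.joint_pos + step.tip_force


def curate(episodes: List[Episode]) -> List[Episode]:
    """ASSUMED curation rule: keep only successful teacher rollouts."""
    return [ep for ep in episodes if ep.success]


def make_windows(episode: Episode, history: int) -> List[Tuple[Window, List[float]]]:
    """Pair each teacher action with the last `history` student observations.

    Early timesteps are left-padded by repeating the first observation,
    so every window has exactly `history` entries.
    """
    obs = [student_obs(s) for s in episode.steps]
    pairs = []
    for t, step in enumerate(episode.steps):
        start = t - history + 1
        window = [obs[max(i, 0)] for i in range(start, t + 1)]
        pairs.append((window, step.action))
    return pairs


def build_dataset(episodes: List[Episode], history: int = 4) -> List[Tuple[Window, List[float]]]:
    """Flatten curated episodes into (observation window, teacher action) pairs."""
    data = []
    for ep in curate(episodes):
        data.extend(make_windows(ep, history))
    return data
```

In a sketch like this, each observation window would be fed to a sequence model as a series of tokens, and the student would be trained by supervised regression onto the paired teacher action. Keeping a history window is one plausible way to let a policy without vision infer contact state over time from force and joint readings.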
