Why Look at It at All?: Vision-Free Multifingered Blind Grasping Using Uniaxial Fingertip Force Sensing

Edgar Lee, Junho Choi, Taemin Kim, Changjoo Nam, Seokhwan Jeong

TL;DR

This work tackles vision-free multifingered grasping by relying solely on uniaxial fingertip force feedback and proprioception. It introduces a teacher–student learning pipeline where a privileged-observation RL teacher is trained in simulation and then distilled into a transformer-based student that operates under partial observations during deployment. Real-world experiments across 18 objects achieve $98.3\%$ overall success (ID $100.0\%$, OOD $97.5\%$), demonstrating strong generalization and robustness while reducing sensing complexity. The combination of privileged-teacher distillation, data curation, and a lightweight, vision-free policy offers a scalable path toward low-cost, reliable dexterous manipulation in industrial settings.
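
As a quick consistency check on the reported rates, the sketch below pools the ID and OOD figures using the split of six ID and twelve OOD objects given in the Figure 5 caption. It assumes every object received the same number of trials, which this summary does not state, and the function name `pooled_success` is illustrative only.

```python
# Assumption: each of the 18 objects was tested for the same number of trials,
# so per-group success rates can be weighted by object count alone.
N_ID, N_OOD = 6, 12                # object counts from the Figure 5 caption
ID_RATE, OOD_RATE = 1.000, 0.975   # success rates reported in the TL;DR


def pooled_success(n_id, r_id, n_ood, r_ood):
    """Object-count-weighted average of the two group success rates."""
    return (n_id * r_id + n_ood * r_ood) / (n_id + n_ood)


overall = pooled_success(N_ID, ID_RATE, N_OOD, OOD_RATE)
print(f"{overall:.1%}")  # prints 98.3%
```

Under that equal-trials assumption the pooled value matches the reported overall success rate, so the three numbers are mutually consistent.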

Abstract

Grasping under limited sensing remains a fundamental challenge for real-world robotic manipulation, as vision and high-resolution tactile sensors often introduce cost, fragility, and integration complexity. This work demonstrates that reliable multifingered grasping can be achieved under extremely minimal sensing by relying solely on uniaxial fingertip force feedback and joint proprioception, without vision or multi-axis/tactile sensing. To enable such blind grasping, we employ an efficient teacher-student training pipeline in which a reinforcement-learned teacher exploits privileged simulation-only observations to generate demonstrations for distilling a transformer-based student policy operating under partial observation. The student policy is trained to act using only sensing modalities available at real-world deployment. We validate the proposed approach on real hardware across 18 objects, including both in-distribution and out-of-distribution cases, achieving a $98.3\%$ overall grasp success rate. These results demonstrate strong robustness and generalization beyond the simulation training distribution, while significantly reducing sensing requirements for real-world grasping systems.

Paper Structure (11 sections, 8 equations, 7 figures, 7 tables)

Figures (7)

  • Figure 1: Overview of the proposed teacher-student training pipeline for blind grasping.
  • Figure 2: Set of the 18 geometric objects used in simulation for training the teacher policy. The object set comprises six geometries (two cuboids A/B, one capsule, two cylinders A/B, and one sphere), each modeled in three sizes.
  • Figure 3: Pipeline of the teacher policy trained in simulation with privileged observations. An MLP takes privileged observations and outputs relative joint position changes ($\Delta\theta_t$). Each finger comprises three actuated joints and a uniaxial fingertip force sensor. The reward function includes task, incentive, and penalty terms. An episode terminates when the object distance exceeds $10~\text{cm}$ in the $xy$-plane.
  • Figure 4: Pipeline of the student policy distilled via behavioral cloning for blind grasping (top). A transformer maps proprioceptive states and uniaxial fingertip forces to relative joint position changes ($\Delta\theta_t$), trained on successful teacher demonstrations with an MSE action loss. Simulation rollouts of $\pi_s$ on in-distribution objects (bottom).
  • Figure 5: Objects used in real-world evaluation. The evaluation includes 18 objects: six in-distribution (ID) and twelve out-of-distribution (OOD) cases. The ID objects are 3D-printed replicas of the medium-sized training shapes, while the OOD objects are real-world items with diverse shapes, materials, and surface textures that fit within the gripper workspace.
  • ...and 2 more figures
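
The captions for Figures 3 and 4 name four mechanics: relative joint-position actions ($\Delta\theta_t$), termination once the object is more than $10~\text{cm}$ away in the $xy$-plane, training on successful teacher demonstrations only, and an MSE action loss. The following is a minimal NumPy sketch of those pieces, not the authors' implementation. All function names, the `"success"` episode key, and the choice of reference point for the distance check are assumptions made for illustration.

```python
import numpy as np

XY_LIMIT_M = 0.10  # 10 cm termination distance from the Figure 3 caption


def step_joint_targets(theta, delta_theta):
    """Relative action: next joint target = current position + predicted change."""
    return np.asarray(theta, dtype=float) + np.asarray(delta_theta, dtype=float)


def should_terminate(object_xy, reference_xy):
    """True once the object is more than 10 cm from the reference in the xy-plane.

    The reference point is an assumption; the caption does not say what the
    distance is measured from.
    """
    offset = np.asarray(object_xy, dtype=float) - np.asarray(reference_xy, dtype=float)
    return bool(np.linalg.norm(offset) > XY_LIMIT_M)


def keep_successful(episodes):
    """Data curation: keep only teacher rollouts flagged as successful.

    Each episode is assumed to be a dict with a boolean "success" entry.
    """
    return [ep for ep in episodes if ep["success"]]


def mse_action_loss(student_actions, teacher_actions):
    """Mean squared error between student and teacher delta-theta actions."""
    diff = np.asarray(student_actions, dtype=float) - np.asarray(teacher_actions, dtype=float)
    return float(np.mean(diff ** 2))
```

In an actual training loop the student's predictions would come from the transformer described in Figure 4 and the loss would be minimized with an autodiff framework; the plain-array version here only makes the quantities concrete.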