See to Touch: Learning Tactile Dexterity through Visual Incentives

Irmak Guzey, Yinlong Dai, Ben Evans, Soumith Chintala, Lerrel Pinto

TL;DR

TAVI introduces a tactile dexterity framework that uses vision-based rewards to train tactile policies online. By learning visual representations with a contrastive objective and applying an OT-based reward from a single human demonstration, it guides a residual policy on a multi-finger hand to perform precision tasks. The approach achieves state-of-the-art success across six dexterous tasks and demonstrates strong generalization and robustness characteristics, while also revealing the importance of visual rewards over tactile cues in reward shaping. This work underscores the value of integrating vision-driven incentives with tactile sensing to achieve human-like dexterity in robotic hands.
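The residual-policy idea mentioned above can be illustrated with a minimal sketch. It assumes the executed action is a base action plus a small learned correction. The function name `residual_action`, the clipping bound `max_offset`, and the 16-dimensional action are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def residual_action(base_action, offset, max_offset=0.1):
    """Return base_action plus a learned correction clipped to +/- max_offset.

    base_action: action proposed by a base policy (assumed here to be given).
    offset:      correction proposed by the learned residual policy.
    max_offset:  hypothetical bound that keeps the correction small.
    """
    base_action = np.asarray(base_action, dtype=float)
    offset = np.clip(np.asarray(offset, dtype=float), -max_offset, max_offset)
    return base_action + offset

# Usage: a hypothetical 16-dimensional joint command.
base = np.zeros(16)
proposed = np.full(16, 0.5)   # exceeds the bound, so it is clipped to 0.1
action = residual_action(base, proposed)
```

Bounding the correction keeps online exploration close to the base behaviour, which is one common motivation for residual formulations.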

Abstract

Equipping multi-fingered robots with tactile sensing is crucial for achieving the precise, contact-rich, and dexterous manipulation that humans excel at. However, relying solely on tactile sensing fails to provide adequate cues for reasoning about objects' spatial configurations, limiting the ability to correct errors and adapt to changing situations. In this paper, we present Tactile Adaptation from Visual Incentives (TAVI), a new framework that enhances tactile-based dexterity by optimizing dexterous policies using vision-based rewards. First, we use a contrastive objective to learn visual representations. Next, we construct a reward function from these visual representations through optimal-transport-based matching on one human demonstration. Finally, we use online reinforcement learning on our robot to optimize tactile-based policies that maximize the visual reward. On six challenging tasks, such as peg pick-and-place, unstacking bowls, and flipping slender objects, TAVI achieves a success rate of 73% using our four-fingered Allegro robot hand. This success rate is 108% higher than that of policies using tactile and vision-based rewards and 135% higher than that of policies without tactile observational input. Robot videos are best viewed on our project website: https://see-to-touch.github.io/.
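The optimal-transport reward step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a cosine-distance cost between frame embeddings, uniform marginals, and an entropy-regularized (Sinkhorn) solver. The function names and the values of `eps` and `n_iters` are hypothetical, and embeddings are assumed to be non-zero vectors.

```python
import numpy as np

def cosine_cost(rollout, demo):
    """C[i, j] = 1 - cosine similarity of rollout frame i and demo frame j."""
    a = rollout / np.linalg.norm(rollout, axis=1, keepdims=True)
    b = demo / np.linalg.norm(demo, axis=1, keepdims=True)
    return 1.0 - a @ b.T

def sinkhorn_plan(cost, eps=0.05, n_iters=200):
    """Entropy-regularized transport plan between uniform marginals."""
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)          # mass on each rollout frame
    nu = np.full(m, 1.0 / m)          # mass on each demonstration frame
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    for _ in range(n_iters):
        v = nu / (kernel.T @ u)       # rescale to match column marginals
        u = mu / (kernel @ v)         # rescale to match row marginals
    return u[:, None] * kernel * v[None, :]

def ot_rewards(rollout, demo):
    """Per-frame reward: negative transport cost carried by each rollout frame."""
    cost = cosine_cost(rollout, demo)
    plan = sinkhorn_plan(cost)
    return -np.sum(plan * cost, axis=1)

# Usage with random stand-ins for visual embeddings (8 frames, 16 dimensions).
rng = np.random.default_rng(0)
demo_emb = rng.normal(size=(8, 16))
rollout_emb = rng.normal(size=(8, 16))
rewards = ot_rewards(rollout_emb, demo_emb)   # one reward per rollout frame
```

A rollout whose embeddings match the demonstration's receives a total reward near zero, while a dissimilar rollout receives a more negative one. The plan only constrains how much mass each frame sends and receives, so it does not by itself enforce temporal ordering.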

Paper Structure (36 sections, 3 equations, 11 figures, 5 tables)

Figures (11)

  • Figure 1: TAVI learns dexterous policies through online learning. Both tactile and image observations are used to retrieve the action, while only the image is used for reward calculation.
  • Figure 2: Rollouts of trained policies from TAVI on six tasks. Videos are best viewed on our website https://see-to-touch.github.io/.
  • Figure 3: We show success rates of TAVI on a variety of objects not seen during demonstration collection.
  • Figure 4: We show an illustration of our long-horizon policy sequencing. TAVI shows robustness when different tasks are sequenced and successfully applies the learned policies separately.
  • Figure 5: Cost matrix $C_{ij}$ for a failed and a successful trajectory. Darker colors represent lower costs and lighter colors represent higher costs. Note the large area of darker colors in the middle of the unsuccessful rollout and the larger area of darker colors at the end of the successful rollout. When OT matching is applied, these low-cost areas compensate for each other, giving an equal reward of -11 for both rollouts. Also note the similarity of the hand pose between the unsuccessful rollout and the expert demonstration, which explains the similarity of the representations.
  • ...and 6 more figures