TacLoc: Global Tactile Localization on Objects from a Registration Perspective

Zirui Zhang, Boyang Zhang, Fumin Zhang, Huan Yin

TL;DR

TacLoc is a novel tactile localization framework that formulates the problem as a one-shot point cloud registration task. It introduces a graph-theoretic partial-to-full registration method that leverages dense point clouds and surface normals from tactile sensing for efficient and accurate pose estimation.

Abstract

Pose estimation is essential for robotic manipulation, particularly when visual perception is occluded during gripper-object interactions. Existing tactile-based methods generally rely on tactile simulation or pre-trained models, which limits their generalizability and efficiency. In this study, we propose TacLoc, a novel tactile localization framework that formulates the problem as a one-shot point cloud registration task. TacLoc introduces a graph-theoretic partial-to-full registration method, leveraging dense point clouds and surface normals from tactile sensing for efficient and accurate pose estimation. Without requiring rendered data or pre-trained models, TacLoc achieves improved performance through normal-guided graph pruning and a hypothesis-and-verification pipeline. TacLoc is evaluated extensively on the YCB dataset. We further demonstrate TacLoc on real-world objects across two different visual-tactile sensors.
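
To make the idea of normal-guided graph pruning concrete, the following is a minimal sketch of a pairwise consistency test between two correspondences. It is not the authors' implementation. It assumes that each correspondence carries a source point, a source normal, a target point and a target normal, and it relies only on the fact that a rigid transform preserves both point-to-point distances and angles between normals. The function name `compatible`, the tuple layout, and the tolerances `dist_tol` and `angle_tol` are hypothetical choices made for illustration.

```python
import math


def _sub(a, b):
    """Component-wise difference of two 3D vectors."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def _angle(u, v):
    """Angle in radians between two non-zero vectors."""
    cos = sum(x * y for x, y in zip(u, v)) / (_norm(u) * _norm(v))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding


def compatible(c_i, c_j, dist_tol=1e-3, angle_tol=math.radians(5.0)):
    """Check whether two correspondences can share one rigid transform.

    Each correspondence is (src_point, src_normal, tgt_point, tgt_normal).
    Both tolerances are assumed values, not taken from the paper.
    """
    (p_i, n_i, q_i, m_i), (p_j, n_j, q_j, m_j) = c_i, c_j
    # Distance consistency: point-pair lengths must match across the clouds.
    dist_ok = abs(_norm(_sub(p_i, p_j)) - _norm(_sub(q_i, q_j))) <= dist_tol
    # Normal consistency: the angle between normals must match as well.
    angle_ok = abs(_angle(n_i, n_j) - _angle(m_i, m_j)) <= angle_tol
    return dist_ok and angle_ok


def rot_z90(v, t=(0.0, 0.0, 0.0)):
    """Rotate v by 90 degrees about the z-axis, then translate by t."""
    return (-v[1] + t[0], v[0] + t[1], v[2] + t[2])


# Two source points with normals, mapped by a known rigid transform.
p1, n1 = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
p2, n2 = (0.01, 0.0, 0.0), (1.0, 0.0, 0.0)
t = (0.1, 0.2, 0.3)
good_1 = (p1, n1, rot_z90(p1, t), rot_z90(n1))
good_2 = (p2, n2, rot_z90(p2, t), rot_z90(n2))
# Same target position as good_2, but paired with an inconsistent normal.
bad_2 = (p2, n2, rot_z90(p2, t), rot_z90(n1))

print(compatible(good_1, good_2))
print(compatible(good_1, bad_2))
```

In this toy case, `bad_2` passes the distance test and is rejected only by the normal test, which is the kind of edge that a normal check can remove from the graph before any clique search.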

Paper Structure (22 sections, 10 equations, 12 figures, 2 tables)

This paper contains 22 sections, 10 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Real-world demonstration using TacLoc. A GelSight Mini sensor touches a textureless 3D-printed object (left), producing a tactile image that is converted into a dense point cloud and registered to a prior CAD model (right). The zoomed view shows the tactile point cloud (blue) aligned on the CAD model (red).
  • Figure 2: Overview of the TacLoc pipeline for one-shot global tactile localization. The process starts with feature extraction to establish initial correspondences. A compatibility graph is constructed based on distance and normal consistency, and maximal cliques are identified for pose hypothesis generation. Finally, a hypothesis-and-verification approach is applied to refine the pose estimate, achieving partial-to-full tactile localization. Note that a CAD model is required as a prior map, and end-effector poses are required to build submaps during sliding touch.
  • Figure 3: For each frame captured by the tactile sensor, we first convert it into a point cloud with normals. We then construct a submap representing either a sliding touch or repeated touches by integrating end-effector poses. This submap is then processed using our front-end preprocessing pipeline. Best viewed zoomed in and in color.
  • Figure 4: Visualization of the candidate extraction pipeline with a minimal example (best viewed in color). (a) Three points in the source point cloud (blue) and three points in the target point cloud (orange), connected by six initial correspondences (gray lines). (b) The correspondences form an undirected graph $G$, with nodes $V = \{\xi_{1,1}, \xi_{1,2}, \xi_{2,1}, \xi_{2,2}, \xi_{2,3}, \xi_{3,3}\}$ and edges $E = \{\{\xi_{1,1}, \xi_{2,2}\}, \{\xi_{1,1}, \xi_{3,3}\}, \{\xi_{1,2}, \xi_{2,1}\}, \{\xi_{2,2}, \xi_{3,3}\}\}$, constructed via pairwise consistency checks. (c) The first maximal clique $C_1 = \{\xi_{1,1}, \xi_{2,2}, \xi_{3,3}\}$. (d) The second maximal clique $C_2 = \{\xi_{1,2}, \xi_{2,1}\}$.
  • Figure 5: Selected samples from the YCB-Reg benchmark. Ten objects from the YCB dataset are shown from left to right. Gray semi-transparent objects represent the target models, while green point clouds are the tactile point clouds produced by the pre-processing approach described in the paper's preprocessing section. Zoomed views of these sliding samples are shown at the bottom.
  • ...and 7 more figures
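
The minimal example in Figure 4 can be reproduced with a short sketch that enumerates maximal cliques of the compatibility graph. It uses the basic Bron-Kerbosch recursion without pivoting, which is a standard algorithm and not necessarily the solver used in the paper. Each node $\xi_{i,j}$ is encoded as the tuple `(i, j)`, and the names `maximal_cliques` and `min_size` are hypothetical.

```python
def maximal_cliques(nodes, edges):
    """Enumerate all maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that extend it, x: already explored.
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return cliques


# Graph G from Figure 4(b): node (i, j) stands for correspondence xi_{i,j}.
nodes = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 3)]
edges = [
    ((1, 1), (2, 2)),
    ((1, 1), (3, 3)),
    ((1, 2), (2, 1)),
    ((2, 2), (3, 3)),
]

all_cliques = maximal_cliques(nodes, edges)

# Assumption: singleton cliques are dropped because one correspondence
# cannot constrain a pose on its own.
min_size = 2
candidates = sorted(
    (c for c in all_cliques if len(c) >= min_size), key=len, reverse=True
)

for clique in candidates:
    print(sorted(clique))
```

The isolated node $\xi_{2,3}$ is formally a maximal clique of size one, so the enumeration returns it as well. The size filter removes it and leaves the two cliques $C_1$ and $C_2$ shown in panels (c) and (d).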