NormalFlow: Fast, Robust, and Accurate Contact-based Object 6DoF Pose Tracking with Vision-based Tactile Sensors

Hung-Jui Huang, Michael Kaess, Wenzhen Yuan

TL;DR

NormalFlow introduces a fast, robust approach for vision-based tactile tracking by directly aligning surface normal maps rather than relying on potentially noisy height-derived point clouds. The method employs a Gauss-Newton optimization over the normal maps, with an inverse-compositional formulation and random pixel subsampling to achieve real-time performance on CPU. It enables accurate long-horizon tracking, including 360° bead rotations with low rotational error, and extends to tactile-based 3D reconstruction with loop closure. The work demonstrates wide generalization across objects, sensor types, and resolutions, highlighting the potential of tactile-based pose estimation for high-precision perception and manipulation tasks.

Abstract

Tactile sensing is crucial for robots aiming to achieve human-level dexterity. Among tactile-dependent skills, tactile-based object tracking serves as the cornerstone for many tasks, including manipulation, in-hand manipulation, and 3D reconstruction. In this work, we introduce NormalFlow, a fast, robust, and real-time tactile-based 6DoF tracking algorithm. Leveraging the precise surface normal estimation of vision-based tactile sensors, NormalFlow determines object movements by minimizing discrepancies between the tactile-derived surface normals. Our results show that NormalFlow consistently outperforms competitive baselines and can track low-texture objects like table surfaces. For long-horizon tracking, we demonstrate that when the sensor is rolled around a bead for 360 degrees, NormalFlow maintains a rotational tracking error of 2.5 degrees. Additionally, we present state-of-the-art tactile-based 3D reconstruction results, showcasing the high accuracy of NormalFlow. We believe NormalFlow unlocks new possibilities for high-precision perception and manipulation tasks that involve interacting with objects using hands. The video demo, code, and dataset are available on our website: https://joehjhuang.github.io/normalflow.
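To make the idea of "minimizing discrepancies between surface normals" concrete, the following is a minimal sketch of a Gauss-Newton solver that recovers a rotation from two sets of unit normals. It is a deliberately simplified illustration and not the authors' implementation: it assumes the normals are already in one-to-one correspondence and estimates rotation only, whereas the method summarized above works on normal maps, estimates a full 6DoF pose, and uses an inverse-compositional formulation with random pixel subsampling. The function names (`skew`, `exp_so3`, `align_normals_gauss_newton`, `rotation_error_deg`) are hypothetical and chosen for this example.

```python
import numpy as np


def skew(v):
    """Return the 3x3 cross-product matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def exp_so3(w):
    """Rodrigues' formula: map a rotation vector to a rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3) + skew(w)  # first-order approximation near zero
    K = skew(w / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def align_normals_gauss_newton(ref_normals, cur_normals, iters=20):
    """Estimate R minimizing sum_i ||R n_i - m_i||^2 (assumed correspondences).

    ref_normals, cur_normals: (N, 3) arrays of unit normals.
    """
    R = np.eye(3)
    for _ in range(iters):
        rotated = ref_normals @ R.T          # row i is R n_i
        residuals = rotated - cur_normals    # row i is R n_i - m_i
        H = np.zeros((3, 3))
        g = np.zeros(3)
        for p, r in zip(rotated, residuals):
            J = -skew(p)                     # d(exp(delta) p)/d(delta) at delta = 0
            H += J.T @ J
            g += J.T @ r
        delta = -np.linalg.solve(H, g)       # Gauss-Newton step
        R = exp_so3(delta) @ R               # left-multiplicative update
        if np.linalg.norm(delta) < 1e-10:
            break
    return R


def rotation_error_deg(R_est, R_true):
    """Geodesic angle between two rotation matrices, in degrees."""
    cos_angle = (np.trace(R_est.T @ R_true) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normals = rng.normal(size=(200, 3))
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    R_true = exp_so3(np.radians(10.0) * np.array([0.0, 0.0, 1.0]))
    R_est = align_normals_gauss_newton(normals, normals @ R_true.T)
    print("rotation error (deg):", rotation_error_deg(R_est, R_true))
```

The `rotation_error_deg` helper computes the kind of angular error that a rotational tracking figure such as the 2.5 degrees quoted above would typically refer to, although the exact metric used in the paper is not stated in this summary.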

Paper Structure

This paper contains 22 sections, 7 equations, 13 figures, 2 tables.

Figures (13)

  • Figure 1: NormalFlow performs fast, accurate, and robust 6DoF object tracking based on only touch sensing. (a) Accurate tracking of a wide variety of objects, including a wrench, a rock, and even low-texture objects like an egg. (b) Applying NormalFlow to tactile-based 3D reconstruction of a $12$mm wide bead highlights NormalFlow's high accuracy.
  • Figure 2: Given two tactile images before and after object movement, we derive the surface normal maps. NormalFlow determines the object transformations by minimizing discrepancies between the surface normal maps.
  • Figure 3: The data collection setup, with the object clamped on the table and the GelSight sensor tracked by MoCap.
  • Figure 4: Initial contact locations (manually labeled) for the seven trials per object in the tracking experiment.
  • Figure 5: Tracking results for the 12 objects. For each object: [left] the object (scale not shown for common objects) and a sample tactile image (Seed's image slightly blurred to avoid visualization discomfort); [right] the 6DoF tracking MAE: left y-axis shows absolute error, right y-axis shows percentage error relative to object movement range.
  • ...and 8 more figures