Residual Rotation Correction using Tactile Equivariance
Yizhe Zhu, Zhang Ye, Boce Hu, Haibo Zhao, Yu Qi, Dian Wang, Robert Platt
TL;DR
This work tackles the data-efficiency challenge in visuotactile robotic manipulation by exploiting the planar rotational symmetry of in-hand object orientation. It introduces EquiTac, which reconstructs surface normal maps from tactile images and uses an $\mathrm{SO}(2)$-equivariant network to predict a yaw target, providing a real-time angular residual that corrects a base visuomotor policy without extra demonstrations. The approach pairs a flow-matching base policy with this tactile residual and, in real-world robot experiments, achieves strong zero-shot generalization to unseen orientations and greater robustness under perturbations. Because the module is lightweight and explicitly encodes tactile equivariance, EquiTac improves reliability in contact-rich manipulation while substantially reducing data and online training requirements.
Abstract
Visuotactile policy learning augments vision-only policies with tactile input, facilitating contact-rich manipulation. However, the high cost of tactile data collection makes sample efficiency the key requirement for developing visuotactile policies. We present EquiTac, a framework that exploits the inherent SO(2) symmetry of in-hand object rotation to improve sample efficiency and generalization for visuotactile policy learning. EquiTac first reconstructs surface normals from the raw RGB inputs of vision-based tactile sensors, so that rotations of the normal vector field correspond to in-hand object rotations. An SO(2)-equivariant network then predicts a residual rotation action that augments a base visuomotor policy at test time, enabling real-time rotation correction without additional reorientation demonstrations. On a real robot, EquiTac achieves robust zero-shot generalization to unseen in-hand orientations with very few training samples, whereas baselines fail even with more training data. To our knowledge, this is the first tactile learning method to explicitly encode tactile equivariance for policy learning, yielding a lightweight, symmetry-aware module that improves reliability in contact-rich tasks.
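The equivariance property described in the abstract can be illustrated with a small numerical sketch. This is not the EquiTac network: the yaw estimator below is a hand-written stand-in (the direction of the summed in-plane normals), and the function names, the axis convention (x to the right, y up, z out of the sensor), and the restriction to 90-degree rotations are all assumptions made for illustration. The sketch only shows that rotating a normal map means rotating both the pixel grid and the vector components, and that an equivariant estimator's output then shifts by the same angle.

```python
import numpy as np


def rotate_normal_map_90(normals: np.ndarray, k: int = 1) -> np.ndarray:
    """Rotate an (H, W, 3) normal map by k * 90 degrees counter-clockwise.

    Assumed convention (hypothetical): x points right, y points up in the
    displayed image, z points out of the sensor surface.
    """
    out = normals
    for _ in range(k % 4):
        # Rotate the pixel grid counter-clockwise as displayed.
        out = np.rot90(out, k=1, axes=(0, 1))
        # Rotate the in-plane vector components by the same 90 degrees:
        # (nx, ny) -> (-ny, nx); the z component is unchanged.
        nx, ny, nz = out[..., 0], out[..., 1], out[..., 2]
        out = np.stack([-ny, nx, nz], axis=-1)
    return out


def estimate_yaw(normals: np.ndarray) -> float:
    """Toy SO(2)-equivariant yaw estimate: angle of the summed in-plane normals."""
    return float(np.arctan2(normals[..., 1].sum(), normals[..., 0].sum()))


def wrap_angle(angle: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return float((angle + np.pi) % (2.0 * np.pi) - np.pi)


def yaw_residual(normals: np.ndarray, target_yaw: float) -> float:
    """Angular correction that would bring the estimated yaw to target_yaw."""
    return wrap_angle(target_yaw - estimate_yaw(normals))


# Synthetic unit normals with a dominant in-plane direction (illustrative data).
rng = np.random.default_rng(0)
normals = rng.normal(size=(8, 8, 3))
normals[..., 0] += 2.0
normals /= np.linalg.norm(normals, axis=-1, keepdims=True)

yaw_before = estimate_yaw(normals)
yaw_after = estimate_yaw(rotate_normal_map_90(normals, k=1))

# Equivariance check: rotating the input by 90 degrees shifts the estimate by 90 degrees.
assert abs(wrap_angle(yaw_after - yaw_before - np.pi / 2)) < 1e-9
```

In this toy setting, `yaw_residual` plays the role of the residual rotation action: it is the wrapped difference between a target yaw and the current estimate, which a base policy's rotation command could be offset by. A learned equivariant network would replace `estimate_yaw` while keeping the same input-output rotation relationship.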
