VibeCheck: Using Active Acoustic Tactile Sensing for Contact-Rich Manipulation
Kaidi Zhang, Do-Gon Kim, Eric T. Chang, Hua-Hsuan Liang, Zhanpeng He, Kathryn Lampo, Philippe Wu, Ioannis Kymissis, Matei Ciocarlie
TL;DR
VibeCheck addresses contact-rich manipulation with active acoustic sensing: one piezoelectric finger transmits a signal through the held object and a second finger receives it, and the response is used to infer material properties, geometry, and extrinsic contacts. The system extracts resonant-frequency features via FFT and kernel PCA, then trains ML classifiers for object type, grasp position, internal pose, and contact state. The contact-state classifier drives a peg-insertion policy that is learned in simulation and relies solely on acoustic feedback. In simulation, the policy reaches 95% success; on a UR5 robot it performs strongly from in-distribution starts and generalizes meaningfully to out-of-distribution poses (about 60% success). Overall, the work shows that active acoustic sensing can serve as a robust, standalone modality for long-horizon manipulation and can complement other tactile sensing approaches in occluded or cluttered environments.
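The FFT and kernel PCA feature pipeline mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the band pooling, the RBF kernel and its `gamma`, the sample rate, and the synthetic decaying-tone "responses" are all assumptions chosen only to make the example self-contained.

```python
import numpy as np

def spectrum_features(signal, n_bands=64):
    """Windowed FFT magnitude, pooled into n_bands bands and L2-normalized."""
    windowed = signal * np.hanning(len(signal))
    magnitude = np.abs(np.fft.rfft(windowed))
    bands = np.array_split(magnitude, n_bands)
    feats = np.array([band.mean() for band in bands])
    return feats / (np.linalg.norm(feats) + 1e-12)

def rbf_kernel_pca(X, n_components=2, gamma=1.0):
    """Embed the rows of X with RBF kernel PCA (training points only)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    K_centered = K - ones @ K - K @ ones + ones @ K @ ones
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Projection of training point i on component k is v_ik * sqrt(lambda_k).
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

def synthetic_response(resonance_hz, rng, fs=48_000, n=1024):
    """Stand-in for a received signal: a decaying tone plus noise (assumption)."""
    t = np.arange(n) / fs
    tone = np.sin(2 * np.pi * resonance_hz * t) * np.exp(-200.0 * t)
    return tone + 0.05 * rng.standard_normal(n)

# Two hypothetical "objects" with different resonant frequencies.
rng = np.random.default_rng(0)
labels = np.array([0] * 20 + [1] * 20)
signals = [synthetic_response(2_000 if y == 0 else 5_000, rng) for y in labels]
X = np.stack([spectrum_features(s) for s in signals])
Z = rbf_kernel_pca(X, n_components=2)
```

The low-dimensional rows of `Z` could then be passed to any standard classifier; which classifier the paper uses is not stated in this summary.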
Abstract
The acoustic response of an object can reveal much about its global state, for example its material properties or the extrinsic contacts it is making with the world. In this work, we build an active acoustic sensing gripper equipped with two piezoelectric fingers: one for generating signals, the other for receiving them. By sending an acoustic vibration from one finger to the other through an object, we gain insight into the object's acoustic properties and contact state. We use this system to classify objects, estimate grasping position, estimate poses of internal structures, and classify the types of extrinsic contacts an object is making with the environment. Using our contact type classification model, we tackle a standard long-horizon manipulation problem: peg insertion. We use a simple simulated transition model based on the performance of our sensor to train an imitation learning policy that is robust to imperfect predictions from the classifier. Finally, we demonstrate the policy on a UR5 robot with active acoustic sensing as the only feedback.
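One way to picture a simulated model "based on the performance of our sensor" is to corrupt the true contact state with a classifier confusion matrix before the policy sees it. The sketch below makes that idea concrete under loud assumptions: the contact labels, the confusion probabilities, and the expert actions are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical contact labels and confusion matrix, P(predicted | true).
CONFUSION = {
    "no_contact": {"no_contact": 0.90, "surface": 0.08, "in_hole": 0.02},
    "surface":    {"no_contact": 0.10, "surface": 0.85, "in_hole": 0.05},
    "in_hole":    {"no_contact": 0.02, "surface": 0.08, "in_hole": 0.90},
}

# Hypothetical expert action for each true contact state.
EXPERT_ACTION = {
    "no_contact": "move_down",
    "surface": "search_xy",
    "in_hole": "push",
}

def noisy_prediction(true_state, rng):
    """Sample what an imperfect classifier would report for true_state."""
    row = CONFUSION[true_state]
    predicted = list(row)
    return rng.choices(predicted, weights=[row[p] for p in predicted], k=1)[0]

def demonstration_pairs(true_states, rng):
    """Pair noisy observations with the expert action for the true state."""
    return [(noisy_prediction(s, rng), EXPERT_ACTION[s]) for s in true_states]

rng = random.Random(0)
pairs = demonstration_pairs(["no_contact", "surface", "surface", "in_hole"], rng)
```

Training on pairs like these exposes an imitation learner to the same kind of misclassification it would meet on hardware, which is the robustness property the abstract describes.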
