CATCH-FORM-ACTer: Compliance-Aware Tactile Control and Hybrid Deformation Regulation-Based Action Transformer for Viscoelastic Object Manipulation
Hongjun Ma, Weichang Li, Jingwei Zhang, Shenlai He, Xiaoyan Deng
TL;DR
The paper tackles the problem of contact-rich manipulation of viscoelastic objects with rigid robots, where dynamic parameter mismatches and spatiotemporal force-deformation coupling hinder dexterity. It introduces CATCH-FORM-ACTer, a phase-aware framework that combines Learning from Demonstrations with an Action Chunking Transformer to plan long-horizon actions while dynamically adjusting stiffness, damping, and diffusion during different manipulation phases, via a CVAE-augmented policy that ingests multi-modal data and phase context. A physics-grounded CATCH-FORM-3D controller provides an interpretable, tunable, and stable inner-outer admittance loop based on a unified 3D Kelvin–Voigt–Maxwell viscoelastic model and a PDE-driven observer for real-time material-property estimation. Experimental validation on single-arm and bimanual tasks shows 10–20% higher success rates than prior ACT variants, leveraging spatial force and deformation fields to guide phase-dependent compliance changes and achieving sub-millimeter control accuracy. Overall, the work offers a practical route to human-like force-deformation modulation in complex viscoelastic interactions, enabling safer, more reliable manipulation in industrial, medical, and household settings.
Abstract
Automating contact-rich manipulation of viscoelastic objects with rigid robots faces challenges including dynamic parameter mismatches, unstable contact oscillations, and spatiotemporal force-deformation coupling. In our prior work, the Compliance-Aware Tactile Control and Hybrid Deformation Regulation (CATCH-FORM-3D) strategy achieved robust and effective manipulation of 3D viscoelastic objects by combining a contact-force-driven admittance outer loop with a PDE-stabilized inner loop, reaching sub-millimeter surface deformation accuracy. However, this strategy requires fine-tuning of object-specific parameters and task-specific calibration. To bridge this gap, we propose CATCH-FORM-ACTer, which enhances CATCH-FORM-3D with an Action Chunking with Transformer (ACT) framework. An intuitive teleoperation system supports Learning from Demonstration (LfD) to build long-horizon sensing, decision-making, and execution sequences. Unlike conventional ACT methods, which focus solely on trajectory planning, our approach dynamically adjusts stiffness, damping, and diffusion parameters in real time during multi-phase manipulation, effectively imitating human-like force-deformation modulation. Experiments with single-arm and bimanual robots on three tasks show better force-field patterns and, as a result, 10%-20% higher success rates than conventional methods, enabling precise, safe interaction in industrial, medical, or household scenarios.
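
To make the idea of phase-dependent compliance concrete, the following is a minimal one-dimensional sketch of an admittance outer loop whose virtual stiffness and damping switch with the manipulation phase. It is an illustration under stated assumptions, not the authors' controller: the phase names, gain values, time step, and the functions `admittance_step` and `simulate` are all hypothetical, and the diffusion parameter and PDE-stabilized inner loop mentioned in the abstract are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Gains:
    mass: float       # virtual inertia M [kg]
    damping: float    # virtual damping D [N*s/m]
    stiffness: float  # virtual stiffness K [N/m]


# Hypothetical phases and gain values, chosen only for illustration.
PHASE_GAINS = {
    "approach": Gains(mass=1.0, damping=40.0, stiffness=400.0),
    "contact": Gains(mass=1.0, damping=60.0, stiffness=100.0),
    "hold": Gains(mass=1.0, damping=80.0, stiffness=200.0),
}


def admittance_step(x, v, force_error, gains, dt):
    """One semi-implicit Euler step of M*a + D*v + K*x = force_error.

    x and v are the displacement offset and its rate relative to the
    nominal trajectory; force_error is measured minus desired contact force.
    """
    a = (force_error - gains.damping * v - gains.stiffness * x) / gains.mass
    v_next = v + a * dt
    x_next = x + v_next * dt
    return x_next, v_next


def simulate(phase_schedule, force_error, dt=0.001):
    """Run the loop over (phase, n_steps) pairs and return the final offset."""
    x, v = 0.0, 0.0
    for phase, n_steps in phase_schedule:
        gains = PHASE_GAINS[phase]  # gains change when the phase changes
        for _ in range(n_steps):
            x, v = admittance_step(x, v, force_error, gains, dt)
    return x
```

Under a constant force error, the offset settles at `force_error / stiffness` for the active phase, so a softer phase yields more displacement for the same force. In a learned setting such as the one the abstract describes, a policy would supply the gains at each step instead of the fixed lookup table assumed here.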
