Learning Stable Robot Grasping with Transformer-based Tactile Control Policies
En Yen Puang, Zechen Li, Chee Meng Chew, Shan Luo, Yan Wu
TL;DR
This work reframes stable grasping by allowing the object's center of gravity to shift during an episode and by introducing a gripping-force control dimension. It employs a model-free, end-to-end Transformer-based reinforcement learning policy, trained with SAC, that processes spatiotemporal tactile maps and outputs continuous changes in grasp location and grip force. A multi-objective reward, controlled by a trade-off parameter $\alpha$, balances rotational stability and slippage in the early stages of an episode against minimizing excess grip force at terminal states. The method achieves near-perfect success in simulation and exhibits zero-shot sim-to-real transfer on real hardware with varied load configurations, offering practical robustness and insights into the trade-off between minimizing the number of grasp attempts and optimizing grip force. The results highlight Transformer architectures as advantageous over CNN baselines for handling irregular temporal tactile data, motivating future tactile-realistic control research.
Abstract
Measuring grasp stability is an important skill for dexterous robot manipulation, and stability can be inferred from the haptic information provided by a tactile sensor. Control policies have to detect rotational displacement and slippage from tactile feedback and determine a re-grasp strategy in terms of location and force. The classic stable grasp task only trains control policies to solve for the re-grasp location on objects with a fixed center of gravity. In this work, we propose a revamped version of the stable grasp task that optimizes both re-grasp location and gripping force for objects with an unknown and moving center of gravity. We tackle this task with a model-free, end-to-end Transformer-based reinforcement learning framework. We show that our approach solves both objectives after training in simulation and transfers zero-shot to a real-world setup. We also provide a performance analysis of different models to understand the dynamics of optimizing two opposing objectives.
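To make the tension between the two opposing objectives concrete, the following is a minimal illustrative sketch of how a trade-off parameter $\alpha$ could weight a stability term against a grip-force term. It is not the reward used in this work: the function name `grasp_reward`, its arguments, the normalization by `max_force`, and the exact way $\alpha$ enters are all assumptions made for illustration only.

```python
def grasp_reward(rotation, slip, grip_force, terminal, alpha=0.5, max_force=1.0):
    """Hypothetical two-objective grasp reward (illustration only).

    rotation, slip: measured rotational displacement and slippage (assumed scalars).
    grip_force:     applied gripping force, in the same units as max_force.
    terminal:       True on the last step of an episode.
    alpha:          assumed trade-off weight in [0, 1]; larger values put more
                    weight on reducing grip force at the terminal state.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if max_force <= 0.0:
        raise ValueError("max_force must be positive")

    # Stability term: penalize both rotation and slippage.
    instability = abs(rotation) + abs(slip)

    # Before the terminal step, only stability is penalized (assumption).
    if not terminal:
        return -instability

    # At the terminal step, trade stability off against normalized grip force.
    force_cost = min(max(grip_force / max_force, 0.0), 1.0)
    return -((1.0 - alpha) * instability + alpha * force_cost)
```

Under these assumptions, a stable terminal grasp held with less force scores higher than the same grasp held with more force, while an unstable intermediate grasp is penalized regardless of force. Setting `alpha` close to 0 recovers a stability-only objective.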
