Admittance Visuomotor Policy Learning for General-Purpose Contact-Rich Manipulations

Bo Zhou, Ruixuan Jiao, Yi Li, Xiaogang Yuan, Fang Fang, Shihua Li

TL;DR

An admittance visuomotor policy framework for continuous, general-purpose, contact-rich manipulation that uses a diffusion-based model to plan action trajectories and desired contact forces from multimodal observations comprising contact force, vision, and proprioception.

Abstract

Contact force in contact-rich environments is an essential modality for robots to perform general-purpose manipulation tasks, as it provides information to compensate for the deficiencies of visual and proprioceptive data in collision perception, high-precision grasping, and efficient manipulation. In this paper, we propose an admittance visuomotor policy framework for continuous, general-purpose, contact-rich manipulations. For demonstration collection, we design a low-cost, user-friendly teleoperation system with contact interaction to gather compliant robot demonstrations and accelerate data collection. For training and inference, we propose a diffusion-based model that plans action trajectories and desired contact forces from multimodal observations comprising contact force, vision, and proprioception. We use an admittance controller for compliant action execution. A comparative evaluation against two state-of-the-art methods was conducted on five challenging tasks, each focusing on different action primitives, to demonstrate our framework's generalization capabilities. Results show that our framework achieves the highest success rate and exhibits smoother and more efficient contact than the other methods: the contact force required to complete each task was reduced by 48.8% on average, and the success rate was increased by 15.3% on average. Videos are available at https://ryanjiao.github.io/AdmitDiffPolicy/.

Paper Structure (31 sections, 9 equations, 7 figures, 2 tables)

Figures (7)

  • Figure 1: A robot learning multi-stage, contact-rich manipulation skills from human demonstration to open a drawer with the AdmitDiff Policy framework. Compared to a policy without the force control module, our complete framework switches more smoothly between contact stages and significantly reduces the contact force required to complete the task.
  • Figure 2: The teleoperation system, worn and used by an operator, is shown in the figure on the right, with the three functional components highlighted. Specific component models are labeled on the left.
  • Figure 3: AdmitDiff Policy Framework. Left: During inference, the observations from the previous two steps are encoded as inputs for noise estimation, and the student model outputs actions for the next 8 time steps. $K$ denotes the number of denoising iterations required by the diffuser. The arm's force-position trajectory is used in the admittance controller to compute the desired pose. Middle: The teacher model is trained for 100 denoising steps; its parameters are then frozen to train the student model with a consistency loss for single-step denoising. Right: Data collection, including contact force information, is performed using the teleoperation system designed in this work.
  • Figure 4: Task Process Summary Diagram. We designed five contact-rich manipulation tasks to evaluate the effectiveness of AdmitDiff Policy and compared it with other imitation learning algorithms. The success rate and contact force optimization results for the different tasks are discussed later in this paper.
  • Figure 5: Force control interpolation and tracking performance on different axes in the Dragging task.
  • ...and 2 more figures
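
The Figure 3 caption states that the planned force-position trajectory is passed to an admittance controller, which computes the desired pose. The paper's controller equations are not reproduced on this page, so the following is only a minimal one-axis sketch of a generic admittance law, M·ë + D·ė + K·e = f_meas − f_des, where e is the offset added to the planned position. The class name `Admittance1D`, the gain values, the time step, and the sign convention are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Admittance1D:
    """Generic single-axis admittance law (illustrative, not the paper's code).

    Virtual dynamics: mass * e'' + damping * e' + stiffness * e = f_meas - f_des,
    where e is the compliant offset added to the planned reference position.
    All default values below are assumptions chosen only for the example.
    """

    mass: float = 1.0         # virtual inertia M [kg]
    damping: float = 20.0     # virtual damping D [N*s/m]
    stiffness: float = 100.0  # virtual stiffness K [N/m]
    dt: float = 0.01          # control period [s]
    offset: float = 0.0       # current offset e [m]
    velocity: float = 0.0     # current offset rate e' [m/s]

    def step(self, x_ref: float, f_des: float, f_meas: float) -> float:
        """Advance one control period and return the commanded position."""
        force_error = f_meas - f_des
        accel = (
            force_error
            - self.damping * self.velocity
            - self.stiffness * self.offset
        ) / self.mass
        # Semi-implicit Euler: update the velocity first, then the offset.
        self.velocity += accel * self.dt
        self.offset += self.velocity * self.dt
        return x_ref + self.offset


if __name__ == "__main__":
    controller = Admittance1D()
    # Hypothetical planner output: hold x_ref = 0.30 m while asking for 5 N,
    # with the sensor reading 8 N, i.e. a constant 3 N force error.
    for _ in range(500):
        x_cmd = controller.step(x_ref=0.30, f_des=5.0, f_meas=8.0)
    print(f"commanded position after 5 s: {x_cmd:.4f} m")
```

When the measured force equals the desired force, the commanded position stays on the planned reference; under a constant force error the offset settles toward the error divided by the virtual stiffness. A full 6-DOF version would apply the same law per axis to the pose and wrench, which this sketch does not attempt.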