Collaborative Assembly Policy Learning of a Sightless Robot

Zeqing Zhang, Weifeng Lu, Lei Yang, Wei Jing, Bowei Tang, Jia Pan

TL;DR

This work tackles a sightless-robot board-insertion task requiring precise co-manipulation with a human operator under sparse rewards. It introduces Policy-Guided PPO (PGPPO), which fuses a human-designed admittance-control policy with reinforcement learning and leverages human demonstrations to bootstrap learning, aided by a simplified human dynamics model and domain randomization. In both simulation and real-world tests, PGPPO outperformed pure admittance control and standard PPO, achieving higher success rates, shorter completion times, and significantly lower interaction forces during insertion. The results demonstrate that admittance-control guidance plus demonstrations can dramatically improve safety and efficiency in physical human-robot collaboration, with potential to generalize to other millimeter-tolerance co-manipulation tasks.

Abstract

This paper explores a physical human-robot collaboration (pHRC) task involving the joint insertion of a board into a frame by a sightless robot and a human operator. While admittance control is commonly used in pHRC tasks, it can be challenging to measure the force/torque applied by the human for accurate human intent estimation, limiting the robot's ability to assist in the collaborative task. Other methods that attempt to solve pHRC tasks using reinforcement learning (RL) are also unsuitable for the board-insertion task due to its safety constraints and sparse rewards. Therefore, we propose a novel RL approach that utilizes a human-designed admittance controller to facilitate more active robot behavior and reduce human effort. Through simulation and real-world experiments, we demonstrate that our approach outperforms admittance control in terms of success rate and task completion time. Additionally, we observed a significant reduction in measured force/torque when using our proposed approach compared to admittance control. The video of the experiments is available at https://youtu.be/va07Gw6YIog.

Paper Structure

This paper contains 14 sections, 9 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: (a) Two people installing a single pane of glass into a window frame is a challenging task, even for skilled workers. (b) This paper presents a novel RL approach that employs a specialized admittance controller to facilitate human-robot collaboration for the board-insertion task, based solely on force feedback.
  • Figure 2: Front view of the board being inserted into the frame. $\bm{f}_c$ denotes the interaction force between the board and the frame; $\bm{f}_h$ and $\bm{\tau}_h$ are the force and torque applied by the human. The F/T sensor measures force $\bm{f}_\text{meas}$ and torque $\bm{\tau}_\text{meas}$, which contain the coupled force/torque, so admittance control faces ambiguity in interpreting human intention. For example, when the human wants a translation along the $Z$ direction and applies a pure force in that direction, a torque about the $Y$ axis is also generated and measured by the F/T sensor. Under admittance control, the robot then simultaneously moves along the $Z$ direction and rotates about the $Y$ axis, resulting in undesired assistance.
  • Figure 3: Learning curves of different methods in simulation. All PGPPO variants, each using a different type of prior knowledge, achieve better performance than admittance control. Standard PPO fails to learn a policy that completes the insertion task.
  • Figure 4: The robot end-effector velocity (upper row) and F/T sensor data (bottom row) in real-world experiments. PGPPO and admittance control differ little in the approaching phase, but the inserting phase is much shorter with PGPPO.
  • Figure 5: The process of the board-insertion task. T1, T2, T3, and T4 are the initial, approaching, inserting, and completed states, respectively. See the video for more experiments.
  • ...and 1 more figures
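
The ambiguity described in the Figure 2 caption can be illustrated with a small numerical sketch. It is not the authors' controller. It assumes a damping-only admittance law, a hypothetical grasp point 0.5 m from the sensor along $X$, and made-up damping gains (`d_lin`, `d_ang`), and it ignores the contact force $\bm{f}_c$. All function names are illustrative.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples (x, y, z)."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def measured_wrench(f_h, r):
    """Wrench at the F/T sensor for a pure human force f_h applied at
    lever arm r from the sensor (assumption: no contact force f_c)."""
    return f_h, cross(r, f_h)


def admittance_velocity(force, torque, d_lin=50.0, d_ang=5.0):
    """Damping-only admittance law (assumed form): v = f / d_lin, w = tau / d_ang."""
    v = tuple(f / d_lin for f in force)
    w = tuple(t / d_ang for t in torque)
    return v, w


# The human pushes with 10 N purely along Z (hypothetical values).
f_h = (0.0, 0.0, 10.0)
r = (0.5, 0.0, 0.0)  # assumed lever arm along X, in metres

f_meas, tau_meas = measured_wrench(f_h, r)
v, w = admittance_velocity(f_meas, tau_meas)

print("measured torque:", tau_meas)   # non-zero about Y despite a pure Z force
print("commanded linear velocity:", v)
print("commanded angular velocity:", w)
```

Even though the human applies no torque, the lever arm produces a measured torque about $Y$, so this admittance law commands a rotation about $Y$ in addition to the intended translation along $Z$.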