Imitation learning for sim-to-real transfer of robotic cutting policies based on residual Gaussian process disturbance force model

Jamie Hathaway, Rustam Stolkin, Alireza Rastegarpanah

TL;DR

This letter proposes a hybrid approach to sim-to-real transfer that combines a milling process force model with a residual Gaussian process (GP) force model learned from one or more real-world cutting force examples.

Abstract

Robotic cutting, or milling, plays a significant role in applications such as disassembly, decommissioning, and demolition. Planning and control of cutting in uncertain real-world environments is a complex task that could benefit from simulated training environments. This letter focuses on sim-to-real transfer for robotic cutting policies, addressing the need for effective policy transfer from simulation to practical implementation. We extend our previous domain generalisation approach to learning cutting tasks, which is based on a mechanistic model-based simulation framework, by proposing a hybrid approach for sim-to-real transfer. The approach combines a milling process force model with a residual Gaussian process (GP) force model learned from one or more real-world cutting force examples. We demonstrate successful sim-to-real transfer of a robotic cutting policy without the need for fine-tuning on the real robot setup. The proposed approach autonomously adapts to materials with differing structural and mechanical properties. Furthermore, we demonstrate that the proposed method outperforms fine-tuning or re-training alone.
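The hybrid force model described in the abstract can be illustrated with a minimal sketch: a nominal force prediction plus a zero-mean GP fitted to the residual between measured and nominal forces. The details here are assumptions made for illustration and are not taken from the paper: the constant `nominal_force` stands in for the mechanistic milling model, the kernel is a squared-exponential with hand-picked hyperparameters, and the training data are synthetic.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, signal_std=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return signal_std**2 * np.exp(-0.5 * (d / length_scale) ** 2)

def nominal_force(t):
    """Hypothetical stand-in for the mechanistic milling force model, in newtons."""
    return np.full_like(t, 5.0, dtype=float)

def fit_residual_gp(t_train, f_measured, noise_std=0.1):
    """Fit a zero-mean GP to the residual (measured minus nominal force)."""
    residual = f_measured - nominal_force(t_train)
    K = rbf_kernel(t_train, t_train) + noise_std**2 * np.eye(t_train.size)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} residual, computed via the Cholesky factor
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, residual))
    return {"t": t_train, "L": L, "alpha": alpha}

def predict_force(gp, t_query):
    """Hybrid prediction: nominal force plus GP residual mean, with the GP std."""
    K_s = rbf_kernel(t_query, gp["t"])
    mean = nominal_force(t_query) + K_s @ gp["alpha"]
    v = np.linalg.solve(gp["L"], K_s.T)
    var = rbf_kernel(t_query, t_query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Synthetic "real-world" example: nominal force plus an unmodelled disturbance.
rng = np.random.default_rng(0)
t_train = np.linspace(0.0, 4.0, 40)
disturbance = 1.5 * np.sin(2.0 * t_train)
f_measured = nominal_force(t_train) + disturbance + rng.normal(0.0, 0.1, t_train.size)

gp = fit_residual_gp(t_train, f_measured)
t_query = np.array([2.0, 10.0])  # inside the data, and far outside it
mean, std = predict_force(gp, t_query)
```

Inside the training range the prediction tracks the disturbance with low uncertainty. Far outside it, the residual mean decays to zero, so the prediction falls back to the nominal model and the standard deviation returns to the prior value.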

Paper Structure

This paper contains 10 sections, 15 equations, 9 figures, 1 table, and 1 algorithm.

Figures (9)

  • Figure 1: Overview of the proposed framework, consisting of three stages. In the first stage, a model of the cutting mechanics (source domain) is employed to train an expert policy. In the second stage, cutting process force data are collected offline on a target domain (real world) and used to train a corrective model of disturbances during the real-world cutting process. In the final stage, imitation learning on a surrogate target domain is employed to align the marginal action distributions of expert and learner policies to generate a new target domain policy.
  • Figure 2: Overview of the experimental setup used for real world cutting experiments, with tool reference frame $\mathcal{M}$, end-effector $\mathcal{EE}$ and world frame $\mathcal{W}$ shown.
  • Figure 3: Plot of measured disturbance force in the feed (Y) direction from the dataset of examples after temporal alignment. The Gaussian process model fit is shown; the shaded areas show the 1-$\sigma$ and 2-$\sigma$ bands about the mean. The dashed line marks the transition from training data (left) to extrapolation (right).
  • Figure 4: Comparison of policy actions and states between the source domain expert in the source and surrogate target domains (Expert, Expert GP), and surrogate target domain policies using fine-tuning, behavioural cloning (BC) and DAgger imitation learning strategies. States include the path error transverse ($e_{x}$) and normal ($e_{z}$) to the planned path, and forces in the feed ($F_{y}$) and normal ($F_{z}$) directions. Forces are smoothed with a 50-point (1 s) moving average filter.
  • Figure 5: Comparison of training curves between base environment and environment with GP residual force model showing average rewards with 95% confidence intervals.
  • ...and 4 more figures
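The Figure 4 caption mentions a 50-point (1 s) moving average applied to the force traces, which implies a sampling rate of 50 Hz. A minimal sketch of such a filter follows. The trailing (causal) window, the handling of the first samples, and the synthetic force signal are assumptions for illustration; the caption does not say whether the window is trailing or centred.

```python
import numpy as np

def moving_average(signal, window=50):
    """Trailing moving average; early samples average over whatever is available."""
    signal = np.asarray(signal, dtype=float)
    csum = np.cumsum(np.insert(signal, 0, 0.0))
    end = np.arange(1, signal.size + 1)
    start = np.maximum(end - window, 0)
    return (csum[end] - csum[start]) / (end - start)

sample_rate_hz = 50                      # implied by 50 points spanning 1 s
t = np.arange(200) / sample_rate_hz      # 4 s of samples
# Hypothetical force trace: 10 N mean with a 5 Hz ripple.
force = 10.0 + np.sin(2.0 * np.pi * 5.0 * t)
smoothed = moving_average(force, window=50)
```

Because the 1 s window spans a whole number of ripple periods in this example, the filtered trace settles to the mean force once the window is full.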