Surgical Task Automation Using Actor-Critic Frameworks and Self-Supervised Imitation Learning

Jingshuai Liu; Alain Andres; Yonghang Jiang; Xichun Luo; Wenmiao Shu; Sotirios A. Tsaftaris

TL;DR

This work presents AC-SSIL, an actor-critic RL framework that adopts a self-supervised imitation learning method, dubbed SSIL, to incorporate state-only expert demonstrations into RL by retrieving from the demonstrations the nearest neighbours of the query state and exploiting the bootstrapping of actor networks.

Abstract

Surgical robot task automation has recently attracted great attention due to its potential to benefit both surgeons and patients. Reinforcement learning (RL) based approaches have shown promise in automating surgical manipulation across various tasks. To address the exploration challenge, expert demonstrations can be used to improve learning efficiency via imitation learning (IL) approaches. However, the success of such methods normally relies on both states and action labels. Unfortunately, action labels can be hard to capture, and annotating them manually is prohibitively expensive because it requires expert knowledge. How to leverage expert demonstrations composed purely of states in RL therefore remains an appealing and open problem. In this work, we present an actor-critic RL framework, termed AC-SSIL, to overcome this challenge of learning with state-only demonstrations collected by following an unknown expert policy. It adopts a self-supervised IL method, dubbed SSIL, to incorporate demonstrated states into RL paradigms by retrieving from the demonstrations the nearest neighbours of the query state and exploiting the bootstrapping of actor networks. Experiments on an open-source surgical simulation platform show that our method delivers remarkable improvements over the RL baseline and performs comparably to action-based IL methods, which indicates the efficacy and potential of our method for expert demonstration-guided learning scenarios.
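The retrieval step mentioned in the abstract can be illustrated with a minimal sketch. It assumes states are plain numeric vectors compared by Euclidean distance; the helper names (`k_nearest_states`, `pseudo_action`, `toy_actor`) are hypothetical. This summary does not give the exact pseudo-labelling rule, so averaging the actor's outputs on the retrieved neighbours is only an illustrative assumption, not the authors' formulation.

```python
import math
from typing import Callable, List, Sequence

State = Sequence[float]
Action = List[float]


def k_nearest_states(query: State, demo_states: List[State], k: int) -> List[State]:
    """Return the k demonstrated states closest to `query` (Euclidean distance)."""
    ranked = sorted(demo_states, key=lambda s: math.dist(query, s))
    return ranked[:k]


def pseudo_action(
    query: State,
    demo_states: List[State],
    actor: Callable[[State], Action],
    k: int = 3,
) -> Action:
    """Hypothetical pseudo label: mean actor output over the retrieved neighbours.

    ASSUMPTION: this averaging rule is for illustration only; it is not taken
    from the paper.
    """
    neighbours = k_nearest_states(query, demo_states, k)
    actions = [actor(s) for s in neighbours]
    dim = len(actions[0])
    return [sum(a[i] for a in actions) / len(actions) for i in range(dim)]


def toy_actor(state: State) -> Action:
    """Stand-in for an actor network: a fixed affine map of the state."""
    return [2.0 * state[0], state[1] + 1.0]


# State-only "expert buffer": no actions are stored alongside the states.
demo_states = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
label = pseudo_action((0.1, 0.1), demo_states, toy_actor, k=2)
```

In an actor-critic setup, a label of this kind could play the role that the demonstrated action plays in behaviour cloning, which is what removes the need for action annotations.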

Paper Structure (21 sections, 8 equations, 7 figures, 1 table)

Introduction
Related Works
Reinforcement Learning for Surgical Assistance
Imitation Learning with Expert Demonstrations
Learning from Observations
Methodology
Problem Formulation: Learning with State-Only Demonstrations
Deep Reinforcement Learning Fundamentals
Self-Supervised Imitation Learning (SSIL)
$K$-Nearest Neighbour Matching
Pseudo Action Labeling
Actor-Critic SSIL Training Framework (AC-SSIL)
Behaviour Regularized Actor Objective
Behaviour Regularized Critic Objective
Experiments
...and 6 more sections

Figures (7)

Figure 1: Our method learns to perform automated surgical tasks using reinforcement learning (RL) algorithms, where an agent takes actions based on states on a simulated platform, receives feedback, and improves over time. We propose a novel self-supervised imitation learning approach that leverages state-only demonstrations, i.e. pure states without action information, collected by an unknown expert policy, in order to enhance RL exploration.
Figure 2: Illustration of actor-critic SSIL training framework (AC-SSIL). The replay buffer $D_\pi$ and the expert buffer $D_E$ are collected from the surgical platform and used to optimize the policy agent. Since no actions are available in $D_E$, the devised self-supervised imitation learning (SSIL) is adopted to provide guidance on model training in a reinforcement learning paradigm.
Figure 3: Comparison between behaviour cloning (BC) and self-supervised imitation learning (SSIL). BC regularizes agent training by minimizing the distance between the policy action and the demonstrated action, where action labels are necessary. The proposed SSIL retrieves from demonstrated states the nearest neighbours of the query state $s_t$ and produces pseudo action labels for exploration guidance, overcoming the need for action annotations.
Figure 4: Surgical manipulation tasks automatically performed by RL agents: (a) NeedlePick, (b) GauzeRetrieve, (c) PegTransfer, and (d) NeedleRegrasp.
Figure 5: Evolution of return over training on the NeedleRegrasp task. Our method AC-SSIL is compared against (a) the comparison methods and (b) the methods used in the analysis of SSIL.
...and 2 more figures
