Composing Option Sequences by Adaptation: Initial Results

Charles A. Meehan, Paul Rademacher, Mark Roberts, Laura M. Hiatt

TL;DR

This paper proposes a framework to determine a priori whether a sequence of options will succeed, together with three approaches that adapt options so that they sequence successfully when it will not. Initial results show that the framework and adaptation methods show promise for adapting options to work in novel sequences.

Abstract

Robot manipulation in real-world settings often requires adapting the robot's behavior to the current situation, for example by changing the sequence in which policies execute to achieve the desired task. However, we show that a novel sequence of five deep RL options composed to perform a pick-and-place task is unlikely to complete successfully, even if the options' initiation and termination conditions align. We propose a framework to determine a priori whether a sequence will succeed, and examine three approaches that adapt options so that they sequence successfully when it will not. Crucially, our adaptation methods consider the actual subset of points that an option is trained from or where it ends: (1) training the second option to start where the first ends; (2) training the first option to reach the centroid of where the second starts; and (3) training the first option to reach the median of where the second starts. Our results show that the framework and adaptation methods show promise for adapting options to work in novel sequences.
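
To illustrate how the centroid and median targets of methods (2) and (3) can differ, here is a minimal sketch. It is not the authors' implementation: the 2-D `start_states` are made-up values, the helper names are hypothetical, and the median is assumed to be taken component-wise, which the abstract does not specify.

```python
from statistics import fmean, median


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    return [fmean(axis) for axis in zip(*points)]


def componentwise_median(points):
    """Component-wise median (an assumed reading of 'median' in the abstract)."""
    return [median(axis) for axis in zip(*points)]


# Hypothetical 2-D states from which the second option was trained to start.
# The last point is a deliberate outlier.
start_states = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.2), (2.0, 2.0)]

# Candidate targets that the first option could be retrained to reach.
centroid_target = centroid(start_states)
median_target = componentwise_median(start_states)

print("centroid target:", centroid_target)
print("median target:  ", median_target)
```

In this toy data the single outlier pulls the centroid away from the cluster of start states, while the component-wise median stays inside it. That is one plausible reason to compare the two targets.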

Paper Structure (13 sections, 4 equations, 11 figures)

Figures (11)

  • Figure 1: Sequence of options for a pick-and-place task.
  • Figure 2: Kinova Gen3 robotic arm in a simplified pick-and-place environment built in robosuite.
  • Figure 3: Conceptual illustration of connected and composable options $o_i$ and $o_j$. (a) Connected options. (b) Composable options.
  • Figure 4: Diagram of the adaptation of option $o_{k}$ using the Origin Method.
  • Figure 5: Diagram of the adaptation of option $o_{j}$ using the Result Methods.
  • ...and 6 more figures

Theorems & Definitions (1)

  • Definition 3.1: Composable Options