Learning Spatial Bimanual Action Models Based on Affordance Regions and Human Demonstrations

Björn S. Plonka, Christian Dreher, Andre Meixner, Rainer Kartmann, Tamim Asfour

TL;DR

A novel approach that learns changes of affordance constraints from human demonstrations to construct spatial bimanual action models representing object interactions. It then formulates an optimization problem to determine optimal object configurations across multiple execution keypoints, taking into account the initial scene, the learned affordance constraints, and the robot's kinematics.

Abstract

In this paper, we present a novel approach for learning bimanual manipulation actions from human demonstration by extracting spatial constraints between affordance regions, termed affordance constraints, of the objects involved. Affordance regions are defined as object parts that provide interaction possibilities to an agent. For example, the bottom of a bottle affords the object to be placed on a surface, while its spout affords the contained liquid to be poured. We propose a novel approach to learn changes of affordance constraints in human demonstration to construct spatial bimanual action models representing object interactions. To exploit the information encoded in these spatial bimanual action models, we formulate an optimization problem to determine optimal object configurations across multiple execution keypoints while taking into account the initial scene, the learned affordance constraints, and the robot's kinematics. We evaluate the approach in simulation with two example tasks (pouring drinks and rolling dough) and compare three different definitions of affordance constraints: (i) component-wise distances between affordance regions in Cartesian space, (ii) component-wise distances between affordance regions in cylindrical space, and (iii) degrees of satisfaction of manually defined symbolic spatial affordance constraints.
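To make definitions (i) and (ii) more concrete, the following minimal sketch computes component-wise differences between two affordance region positions in Cartesian and in cylindrical coordinates. It is an illustration under stated assumptions, not the paper's formulation: the function names, the choice of the z-axis as the cylinder axis, the shared reference frame, and the example coordinates are all hypothetical.

```python
import math


def cartesian_difference(p, q):
    """Component-wise difference (dx, dy, dz) from point p to point q."""
    return tuple(b - a for a, b in zip(p, q))


def to_cylindrical(p):
    """Convert (x, y, z) to (radius, azimuth, height).

    Assumption: the cylinder axis is the z-axis of the reference frame.
    """
    x, y, z = p
    return (math.hypot(x, y), math.atan2(y, x), z)


def cylindrical_difference(p, q):
    """Component-wise difference (dr, dphi, dz) in cylindrical coordinates.

    The angular difference is wrapped to [-pi, pi) so that two points on
    either side of the +/-pi discontinuity are still reported as close.
    """
    r1, phi1, z1 = to_cylindrical(p)
    r2, phi2, z2 = to_cylindrical(q)
    dphi = (phi2 - phi1 + math.pi) % (2 * math.pi) - math.pi
    return (r2 - r1, dphi, z2 - z1)


# Hypothetical positions (in meters) of a bottle spout and a cup opening.
spout = (0.10, 0.00, 0.25)
cup_opening = (0.00, 0.10, 0.12)

print(cartesian_difference(spout, cup_opening))
print(cylindrical_difference(spout, cup_opening))
```

In this example both points lie at the same radius from the axis, so the cylindrical radial component is close to zero even though the Cartesian x and y components are not. This suggests one reason a cylindrical representation could be attractive for rotationally symmetric objects, although the comparison reported in the paper rests on its own evaluation.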

Paper Structure

This paper contains 19 sections, 6 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: We learn the spatial constraints between affordance regions (affordance constraints) from human demonstrations to obtain spatial bimanual action models. For execution on a robot, these are used to maximize the similarity between the learned affordance constraints and those present in the current scene subject to the robot's kinematics.
  • Figure 2: A simplified visual overview of the Spatial Bimanual Action Model.
  • Figure 3: An exemplary segmentation on synthetic data.
  • Figure 4: Segmenting changes of affordance constraints over time allows for generalization. In addition to the mean value, we compute the standard deviation at the end of each segment. The colored areas show the confidence intervals derived from the mean and standard deviation.
  • Figure 5: The cumulative number of keypoint candidates that fall within the time window of each bin. A Butterworth filter and peak detection are used to determine the optimal keypoints.
  • ...and 3 more figures
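
The keypoint selection described in the Figure 5 caption (bin the keypoint candidates over time, smooth the counts, pick the peaks) can be sketched as follows. This is a simplified stand-in, not the authors' implementation: a zero-padded moving average replaces the Butterworth filter so that the example needs only the standard library, and the function names, bin width, window size, and candidate times are hypothetical.

```python
def histogram(candidates, bin_width, num_bins):
    """Count how many candidate times fall into each time bin."""
    counts = [0] * num_bins
    for t in candidates:
        idx = int(t // bin_width)
        if 0 <= idx < num_bins:
            counts[idx] += 1
    return counts


def moving_average(values, window=3):
    """Smooth with a centered moving average (zero-padded at the borders).

    Assumption: this replaces the Butterworth low-pass filter named in
    the caption; it is only a simple smoothing substitute.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / window)
    return smoothed


def find_peaks(values, min_height=0.0):
    """Return indices of interior local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(values) - 1):
        is_peak = values[i] > values[i - 1] and values[i] >= values[i + 1]
        if is_peak and values[i] >= min_height:
            peaks.append(i)
    return peaks


# Hypothetical keypoint candidate times (seconds) pooled over demonstrations.
candidates = [0.7, 1.0, 1.1, 1.2, 1.3, 1.6,
              3.1, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.2]
bin_width = 0.5

counts = histogram(candidates, bin_width, num_bins=10)
peaks = find_peaks(moving_average(counts))
# Report each selected keypoint as the center time of its bin.
keypoint_times = [(i + 0.5) * bin_width for i in peaks]
print(counts, peaks, keypoint_times)
```

With SciPy available, `scipy.signal.butter` together with `scipy.signal.filtfilt`, followed by `scipy.signal.find_peaks`, would be a closer match to the filter and peak detection named in the caption; the filter order and cutoff frequency would still have to be chosen, since they are not given here.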