3HANDS Dataset: Learning from Humans for Generating Naturalistic Handovers with Supernumerary Robotic Limbs

Artin Saberpour Abadian, Yi-Chi Liao, Ata Otaran, Rishabh Dabral, Marie Muehlhaus, Christian Theobalt, Martin Schmitz, Jürgen Steimle

TL;DR

The paper addresses the challenge of enabling naturalistic handovers with hip-mounted supernumerary robotic limbs (SRLs) by introducing the 3HANDS dataset, which captures 946 asymmetric, intimate-space interactions between a primary user and a second person enacting an SRL across 12 daily activities. It presents three CVAE-based models trained on 3HANDS to (i) generate naturalistic handover trajectories, (ii) predict the region of transfer (ROT) where handovers occur, and (iii) predict when to initiate a handover from implicit cues. Trained with a two-stage scheme and evaluated on a participant-level train-test split, the models generate plausible trajectories (MAE of about $2.1$–$2.7$ cm in the non-autoregressive setting) and ROTs (MAE $4.02$–$8.04$ cm; MEAE $0.0002$–$0.004$ rad). A VR user study shows improved perceived naturalness, comfort, timeliness, and appropriateness compared to a baseline. Together, the dataset and models advance data-driven SRL interaction in intimate spaces, and the authors release 3HANDS to support future research in human-robot handovers and peripersonal-space robotics.
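To make the evaluation terms above concrete, here is a minimal Python sketch of what a participant-level train-test split and an MAE in centimeters could look like. It is not the authors' code. The `pair_id` field, the `test_fraction` value, the function names, and the per-coordinate definition of MAE are all assumptions made for illustration; the summary does not state how the paper computes its MAE or how many pairings it holds out.

```python
import random


def participant_level_split(samples, test_fraction=0.25, seed=0):
    """Split samples so that no participant pairing appears in both sets.

    `samples` is a list of dicts with a hypothetical "pair_id" key
    identifying the participant pairing that produced the recording.
    """
    pair_ids = sorted({s["pair_id"] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(pair_ids)
    n_test = max(1, round(len(pair_ids) * test_fraction))
    test_ids = set(pair_ids[:n_test])
    train = [s for s in samples if s["pair_id"] not in test_ids]
    test = [s for s in samples if s["pair_id"] in test_ids]
    return train, test


def mae_cm(pred, target):
    """Mean absolute error over all coordinates, converted from meters to cm.

    Assumption: positions are 3D points in meters and the error is
    averaged per coordinate; the paper may define its MAE differently.
    """
    total, count = 0.0, 0
    for p, t in zip(pred, target):
        for a, b in zip(p, t):
            total += abs(a - b)
            count += 1
    return 100.0 * total / count
```

Splitting by pairing rather than by individual recording keeps every interaction from a held-out pairing out of the training set, so the reported error reflects generalization to unseen people rather than to unseen clips of familiar people.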

Abstract

Supernumerary robotic limbs (SRLs) are robotic structures integrated closely with the user's body, which augment human physical capabilities and necessitate seamless, naturalistic human-machine interaction. For effective assistance in physical tasks, enabling SRLs to hand over objects to humans is crucial. Yet, designing heuristic-based policies for robots is time-consuming, difficult to generalize across tasks, and results in less human-like motion. When trained with proper datasets, generative models are powerful alternatives for creating naturalistic handover motions. We introduce 3HANDS, a novel dataset of object handover interactions between a participant performing a daily activity and another participant enacting a hip-mounted SRL in a naturalistic manner. 3HANDS captures the unique characteristics of SRL interactions: operating in intimate personal space with asymmetric object origins, implicit motion synchronization, and the user's engagement in a primary task during the handover. To demonstrate the effectiveness of our dataset, we present three models: one that generates naturalistic handover trajectories, another that determines the appropriate handover endpoints, and a third that predicts the moment to initiate a handover. In a user study (N=10), we compare handovers performed with our method against a baseline. The findings show that our method was perceived as significantly more natural, less physically demanding, and more comfortable.

Paper Structure

This paper contains 49 sections, 5 equations, 7 figures, 8 tables.

Figures (7)

  • Figure 1: The 3HANDS dataset comprises an extensive collection of human motion data of asymmetric object handovers between users and a human-enacted third arm, which assists an ongoing activity by handing in or taking away objects at an intimate distance to the user. It contains recordings of 946 interactions captured with 12 participant pairings while performing 12 daily activities. The dataset comprises rigged skeleton data for the full body (69 joints) and hands (21 joints). We demonstrate the dataset's utility to train state-of-the-art machine learning models for three essential steps in the handover activity: generating naturalistic handover trajectories, predicting the location of the handover, and identifying the intent to initiate a handover.
  • Figure 2: Setup of the asymmetric handover task. Left: the primary participant was standing and performing the primary activity, while the second participant enacted a robotic arm for handing over an object. Right: we generate rigged skeletons of both humans, including their articulated hands.
  • Figure 3: Distribution of the locations where the object was handed over between participants. The points are presented in the user's hips coordinate system. (left) shows the distribution from the top view (head at origin, facing right) and (right) from the front view (hip at origin, user facing into the plane).
  • Figure 4: Distribution of palm positions over the entire dataset while performing the 12+1 activities. The points are presented in the user's hips coordinate system. (left) shows the top view (head at origin, facing right), (right) the front view (hip at origin, user facing into the plane). The color encodes the left and right hands. The unit is meters.
  • Figure 5: Overview of our proposed model architecture for generating the trajectory of a handover based on the motion dynamics encoded into the model's latent space.
  • ...and 2 more figures