Learning Human-Robot Handshaking Preferences for Quadruped Robots

Alessandra Chappuis; Guillaume Bellegarda; Auke Ijspeert

TL;DR

The study addresses learning user-specific handshaking preferences for quadruped robots to foster social trust in real-world human–robot interactions. It introduces a parameterized handshake model with amplitude $a$, frequency $f$, and stiffness $K_p$, controlled by a Cartesian PD loop, and optimized through active preference-based reward learning (APReL) using 10 pairwise comparisons per user across 25 participants. The reward is $R(\xi)=\omega^{\top} \Phi(\xi)$, updated with Metropolis-Hastings Bayesian inference and a softmax user response model, yielding personalized handshake parameters that reduce amplitude/frequency errors, DTW, and torque, while increasing user satisfaction (19/25 happy, 5 neutral). The work demonstrates rapid personalization of a social gesture on a quadruped robot, with empirical evidence of improved synchronization and energy efficiency, and provides insights into gender differences and passive versus active handshake dynamics. These findings have practical implications for deploying socially capable quadrupeds in public or service settings where trust and natural interaction are crucial.
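The belief update described above can be sketched as follows. This is a minimal illustration, not the authors' implementation or the APReL library API: it assumes a linear reward $R(\xi)=\omega^{\top}\Phi(\xi)$, a softmax response model over two handshakes, a uniform prior on the unit ball for $\omega$, and a Gaussian random-walk proposal. The function names, the feature vectors, and the prior are all hypothetical choices made for the example.

```python
import math
import random


def softmax_choice_logprob(w, phi_a, phi_b, choice):
    """Log-probability that the user picks option `choice` (0 = A, 1 = B)
    under a softmax response model with linear rewards w . phi."""
    r_a = sum(wi * xi for wi, xi in zip(w, phi_a))
    r_b = sum(wi * xi for wi, xi in zip(w, phi_b))
    m = max(r_a, r_b)  # subtract the max for numerical stability
    log_z = m + math.log(math.exp(r_a - m) + math.exp(r_b - m))
    return (r_a if choice == 0 else r_b) - log_z


def log_posterior(w, data):
    """Unnormalized log-posterior: uniform prior on the unit ball (an
    assumption) plus the softmax log-likelihood of each comparison."""
    if sum(wi * wi for wi in w) > 1.0:
        return -math.inf
    return sum(softmax_choice_logprob(w, a, b, c) for a, b, c in data)


def metropolis_hastings(data, dim, n_samples=2000, burn_in=500, step=0.2, seed=0):
    """Draw samples of w with a Gaussian random-walk Metropolis-Hastings chain."""
    rng = random.Random(seed)
    w = [0.0] * dim
    lp = log_posterior(w, data)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = [wi + rng.gauss(0.0, step) for wi in w]
        lp_prop = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            w, lp = proposal, lp_prop
        if i >= burn_in:
            samples.append(list(w))
    return samples


def posterior_mean(samples):
    """Average the samples component-wise to get a point estimate of w."""
    n = len(samples)
    return [sum(s[k] for s in samples) / n for k in range(len(samples[0]))]


# Hypothetical data: in 10 comparisons the user always prefers the handshake
# whose first feature is larger, so the estimate of w[0] should be positive.
comparisons = [([1.0, 0.2], [0.0, 0.2], 0)] * 10
w_hat = posterior_mean(metropolis_hastings(comparisons, dim=2))
```

In an active-learning loop, the samples would also be used to choose the next pair of handshakes to show the user; that query-selection step is omitted here.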

Abstract

Quadruped robots are showing impressive abilities to navigate the real world. If they are to become more integrated into society, social trust in interactions with humans will become increasingly important. Additionally, robots will need to be adaptable to different humans based on individual preferences. In this work, we study the social interaction task of learning optimal handshakes for quadruped robots based on user preferences. While the robot maintains balance on three legs, we parameterize handshakes with a Central Pattern Generator defined by an amplitude, frequency, stiffness, and duration. Through 10 binary choices between handshakes, we learn a belief model to fit individual preferences for 25 different subjects. Our results show that this is an effective strategy, with 76% of users feeling happy with their identified optimal handshake parameters, and 20% feeling neutral. Moreover, compared with random and test handshakes, the optimized handshakes have significantly decreased errors in amplitude and frequency, lower Dynamic Time Warping scores, and improved energy efficiency, all of which indicate robot synchronization to the user's preferences. Video results can be found at https://youtu.be/elvPv8mq1KM .
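For readers unfamiliar with the Dynamic Time Warping (DTW) score mentioned above, the sketch below shows the classic dynamic-programming formulation for two one-dimensional signals. It is a generic textbook version with an absolute-difference local cost; which signals the paper compares and how it normalizes the score are not stated here, so the function name and the choice of cost are assumptions.

```python
def dtw_distance(x, y):
    """Classic DTW cost between two 1-D sequences using |x_i - y_j| as the
    local cost and steps (i-1, j), (i, j-1), (i-1, j-1)."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A time-stretched copy of a signal aligns at zero cost, whereas a constant
# offset between the two signals accumulates cost at every aligned sample.
stretched = dtw_distance([0.0, 1.0, 2.0], [0.0, 0.0, 1.0, 2.0])
offset = dtw_distance([0.0, 0.0], [1.0, 1.0])
```

A lower score means the two trajectories can be brought into alignment with less distortion, which is why it serves as a proxy for synchrony between two motions.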

Paper Structure (22 sections, 6 equations, 8 figures, 3 tables)

Introduction
Human-Robot Interaction
Quadruped Robots
Contribution
Methods
Generating and Parameterizing Handshakes
Control
Handshake Overview
Active Preference-Based Reward Learning
Experimental Procedure and Evaluation Metrics
Experimental Procedure
Evaluation Metrics
Results and Discussion
Sample Handshakes and Synchrony
Learning Process and User Satisfaction
...and 7 more sections

Figures (8)

Figure 1: Learning human-preferred handshaking on the Unitree Go1.
Figure 2: Control diagram for mapping preferred parameters to joint torques for handshakes with a quadruped robot. (a): Optimized handshake parameters amplitude ($a$), frequency ($f$), and Cartesian stiffness ($K_p$) create a desired foot trajectory $\bm{p}_d$ which is tracked with Cartesian PD control. The solid lines operate at 1 kHz, while the dotted line indicates the new handshake parameters, which are sent when a grasp is detected. (b): the world frame amplitude $a$ is mapped to the robot frame based on the robot pitch $\theta$.
Figure 3: Robot positions during handshaking experiments. (1): standing at rest. (2): sitting on rear legs. (3): raising the front right foot and waiting for a grasp. (4): performing a user handshake. (5): returning to the nominal shaking position and waiting for the next grasp.
Figure 4: Experimental procedure. The robot first sits on its rear legs and lifts its front right foot to prepare for user handshakes. The training block (shaded blue) consists of 10 trials of two sets of random handshake parameters each, with the user giving feedback on which handshake they prefer. The belief model is updated after each pair of handshakes to reflect the user preferences. After 10 trials, the user is shown the handshake generated from their identified preferred parameters and rates their satisfaction as happy, neutral, or displeased. Lastly, we perform a validation study in which the user alternates between their identified optimal parameters and high/low perturbations of a single parameter at a time.
Figure 5: Comparison of three distinct handshakes. Plots of the foot position along the z-axis (left) and the joint torques (right) as a function of time. For each handshake the corresponding parameters are indicated: $\{a, f, K_p\}$. (A): synchronous motion between the user and the robot, $\{3.5, 1.53, 114.87\}$. (B): asynchronous motion, $\{9.4, 1.81, 143.71\}$. (C): out-of-phase motion, with the robot and the user moving in opposite intended directions, $\{3.5, 2.05, 73.3\}$.
...and 3 more figures
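
The control pipeline summarized in the Figure 2 caption (handshake parameters $a$, $f$, $K_p$ producing a desired foot trajectory that is tracked with Cartesian PD control) can be sketched as below. This is an illustrative sketch only: the sinusoidal form of the trajectory, the damping gain `kd`, the SI units, and the 3x3 foot Jacobian passed in by the caller are assumptions, and the pitch-dependent mapping of the amplitude from panel (b) is omitted.

```python
import math


def desired_foot_state(t, a, f, p0):
    """Vertical sinusoid around the nominal foot position p0 = (x, y, z).
    Returns the desired position and velocity at time t (assumed form)."""
    omega = 2.0 * math.pi * f
    z = p0[2] + a * math.sin(omega * t)
    vz = a * omega * math.cos(omega * t)
    return (p0[0], p0[1], z), (0.0, 0.0, vz)


def cartesian_pd_force(p_d, v_d, p, v, kp, kd):
    """Cartesian PD law: F = kp * (p_d - p) + kd * (v_d - v), per axis."""
    return tuple(
        kp * (pd_i - p_i) + kd * (vd_i - v_i)
        for pd_i, vd_i, p_i, v_i in zip(p_d, v_d, p, v)
    )


def joint_torques(jacobian, force):
    """Map a Cartesian foot force to joint torques with tau = J^T F.
    `jacobian` is a 3x3 list of rows (Cartesian axis by joint)."""
    return tuple(
        sum(jacobian[r][c] * force[r] for r in range(3)) for c in range(3)
    )


# Hypothetical numbers: 3 cm amplitude, 1.5 Hz, stiffness 100 N/m.
p0 = (0.2, -0.1, -0.25)
p_d, v_d = desired_foot_state(0.1, a=0.03, f=1.5, p0=p0)
force = cartesian_pd_force(p_d, v_d, p0, (0.0, 0.0, 0.0), kp=100.0, kd=2.0)
tau = joint_torques([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], force)
```

On hardware the Jacobian would come from the leg kinematics at the current joint angles, and the loop would run at the control rate stated in the caption (1 kHz); the identity matrix above is only a placeholder.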
