Transition State Clustering for Interaction Segmentation and Learning

Fabian Hahne, Vignesh Prasad, Alap Kshirsagar, Dorothea Koert, Ruth Maria Stock-Homburg, Jan Peters, Georgia Chalvatzaki

TL;DR

This paper addresses the segmentation errors that arise when a model of Human-Robot Interaction (HRI) learned from joint demonstrations must segment an interaction from human observations alone. It introduces Transition State Clustering (TSC) on top of a Hidden Markov Model (HMM): an HMM is first trained on joint human-robot trajectories, and an additional mixture distribution is then learned over the observations near the transition boundaries of its hidden states, where misclassifications most often occur. At test time, the forward variable is computed from the human observations, and the robot trajectory is obtained by conditioning the augmented model on them. The approach improves trajectory predictions on handshake and fistbump tasks, reducing mean squared error with minimal overhead.

Abstract

Hidden Markov Models with an underlying Mixture of Gaussian structure have proven effective in learning Human-Robot Interactions from demonstrations for various interactive tasks via Gaussian Mixture Regression. However, a mismatch occurs when segmenting the interaction using only the observed state of the human compared to the joint state of the human and the robot. To enhance this underlying segmentation and subsequently the predictive abilities of such Gaussian Mixture-based approaches, we take a hierarchical approach by learning an additional mixture distribution on the states at the transition boundary. This helps prevent misclassifications that usually occur in such states. We find that our framework improves the performance of the underlying Gaussian Mixture-based approach, which we evaluate on various interactive tasks such as handshaking and fistbumps.
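The prediction step described above (a forward variable computed from human observations, followed by Gaussian Mixture Regression) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the index arguments `h_idx` and `r_idx`, and the toy two-state model are all assumptions made for the example, and the transition state clusters are not included.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) evaluated at x."""
    d = x - mean
    k = mean.shape[0]
    norm = np.sqrt((2.0 * np.pi) ** k * np.linalg.det(cov))
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)

def forward_step(alpha_prev, A, means, covs, x_h, h_idx):
    """One normalized forward update that uses only the human dimensions."""
    # Marginal likelihood of the human observation under each joint Gaussian.
    lik = np.array([
        gaussian_pdf(x_h, m[h_idx], c[np.ix_(h_idx, h_idx)])
        for m, c in zip(means, covs)
    ])
    alpha = (alpha_prev @ A) * lik
    return alpha / alpha.sum()

def gmr_predict(alpha, means, covs, x_h, h_idx, r_idx):
    """Condition each joint Gaussian on x_h and blend the results with alpha."""
    pred = np.zeros(len(r_idx))
    for w, m, c in zip(alpha, means, covs):
        gain = c[np.ix_(r_idx, h_idx)] @ np.linalg.inv(c[np.ix_(h_idx, h_idx)])
        pred += w * (m[r_idx] + gain @ (x_h - m[h_idx]))
    return pred

# Hypothetical toy model: 2 hidden states, 1 human DoF (index 0), 1 robot DoF (index 1).
means = [np.array([0.0, 1.0]), np.array([2.0, 3.0])]
covs = [np.array([[1.0, 0.5], [0.5, 1.0]])] * 2
A = np.array([[0.9, 0.1], [0.1, 0.9]])
h_idx, r_idx = [0], [1]

alpha = np.array([0.5, 0.5])
x_h = np.array([2.0])
alpha = forward_step(alpha, A, means, covs, x_h, h_idx)
robot_pred = gmr_predict(alpha, means, covs, x_h, h_idx, r_idx)
print(alpha, robot_pred)
```

Because the state weights come from the human marginals alone, any mismatch between that segmentation and the joint one carries straight into the blended robot prediction, which is the failure mode the abstract describes.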

Paper Structure

This paper contains 7 sections, 5 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: An example of the hidden states (Human - red, Robot - blue) learned by an HMM and the learned transition state clusters (Human - magenta, Robot - cyan) from demonstrations of handshaking.
  • Figure 2: An overview of our proposed approach. Given demonstrations of an interaction (left), such as end-effector trajectories when performing a handshake, we first learn an HMM over the demonstrations (middle) to segment the interaction into underlying phases (red - human, blue - robot). Based on the learned HMM, we subsequently learn an additional distribution over the observations near the transition boundaries of the HMM hidden states, as shown in the image on the right (magenta - human, cyan - robot).
  • Figure 3: Example of the predicted segments when using the combined DoFs of the human and the robot (top row) and only the human DoFs (bottom row).
  • Figure 4: Example 3D plots of reconstructed trajectories for the interactions considered in this work. Each plot shows the human input trajectory, the ground-truth robot trajectory, and the reconstructed robot trajectory, along with the Gaussian states of the HMM and the transition state clusters.
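
The second stage outlined in the Figure 2 caption, learning a distribution over observations near the transition boundaries of the HMM hidden states, might be approximated as below. This is only a sketch under stated assumptions: the fixed window `half_width`, the use of one Gaussian per (from, to) state pair, and the function names are hypothetical choices for illustration, and the paper's actual selection and clustering procedure may differ.

```python
import numpy as np

def transition_windows(labels, half_width=2):
    """Map each (from_state, to_state) pair to frame indices near its boundaries."""
    labels = np.asarray(labels)
    windows = {}
    # t is the first frame of the new segment at each label change.
    for t in np.flatnonzero(labels[1:] != labels[:-1]) + 1:
        key = (int(labels[t - 1]), int(labels[t]))
        lo = max(0, t - half_width)
        hi = min(len(labels), t + half_width)
        windows.setdefault(key, []).extend(range(lo, hi))
    return windows

def fit_transition_gaussians(X, labels, half_width=2, reg=1e-6):
    """Fit one Gaussian (mean, covariance) per transition pair (assumed form)."""
    X = np.asarray(X, dtype=float)
    clusters = {}
    for key, idx in transition_windows(labels, half_width).items():
        pts = X[idx]
        mean = pts.mean(axis=0)
        # Small diagonal term keeps the covariance invertible for tiny windows.
        cov = np.atleast_2d(np.cov(pts, rowvar=False)) + reg * np.eye(X.shape[1])
        clusters[key] = (mean, cov)
    return clusters

# Hypothetical segmentation of an 8-frame, 2-D trajectory into three phases.
labels = [0, 0, 0, 1, 1, 1, 2, 2]
X = np.arange(16, dtype=float).reshape(8, 2)
clusters = fit_transition_gaussians(X, labels, half_width=1)
print(sorted(clusters))
```

Here `labels` stands in for a per-frame segmentation such as a Viterbi path from the first-stage HMM; the resulting Gaussians play the role of the transition state clusters shown in magenta and cyan in the figures.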