Transition State Clustering for Interaction Segmentation and Learning
Fabian Hahne, Vignesh Prasad, Alap Kshirsagar, Dorothea Koert, Ruth Maria Stock-Homburg, Jan Peters, Georgia Chalvatzaki
TL;DR
This paper tackles segmentation errors in learning Human-Robot Interactions when using only human observations by introducing Transition State Clustering (TSC) on top of a Hidden Markov Model (HMM). The method trains an HMM on joint human-robot trajectories and then learns a second HMM over observations near transition boundaries, capturing transition states that commonly cause misclassifications. By using the forward variable computed from human observations and conditioning robot outputs through the augmented model, the approach yields improved trajectory predictions across handshake and fistbump tasks, with notable reductions in mean squared error and minimal overhead. This hierarchical HMM-TSC framework enhances segmentation fidelity and predictive performance in HRI, enabling more reliable robot responses to human actions.
Abstract
Hidden Markov Models with an underlying Mixture of Gaussians structure have proven effective in learning Human-Robot Interactions from demonstrations for various interactive tasks via Gaussian Mixture Regression. However, a mismatch arises at test time: the segmentation obtained from only the observed state of the human can differ from the one obtained from the joint state of the human and the robot. To enhance this underlying segmentation and subsequently the predictive abilities of such Gaussian Mixture-based approaches, we take a hierarchical approach by learning an additional mixture distribution on the states at the transition boundary. This helps prevent the misclassifications that usually occur in such states. We find that our framework improves the performance of the underlying Gaussian Mixture-based approach, which we evaluate on interactive tasks such as handshakes and fistbumps.
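To make the baseline pipeline concrete, the following is a minimal sketch of HMM-based Gaussian Mixture Regression: the forward variable is computed from human observations only, and the robot output is obtained by conditioning each joint Gaussian on the human part. It covers only the underlying Gaussian Mixture-based approach, not the transition state clustering itself. The function names, the toy parameters, and the layout of the joint state (human dimensions first, robot dimensions last) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def predict_robot(X_h, pi, A, means, covs, dh):
    """Forward pass on human observations, then GMR for the robot part.

    Assumed layout: each joint mean is [human (dh dims), robot (rest)].
    """
    alpha = None
    preds = []
    for x_h in X_h:
        # Emission likelihood from the human marginal of each state.
        lik = np.array([gaussian_pdf(x_h, m[:dh], c[:dh, :dh])
                        for m, c in zip(means, covs)])
        prior = pi if alpha is None else alpha @ A
        alpha = lik * prior
        alpha = alpha / alpha.sum()  # normalised forward variable
        # Condition each joint Gaussian on the human observation.
        y = np.zeros(means.shape[1] - dh)
        for w, m, c in zip(alpha, means, covs):
            gain = c[dh:, :dh] @ np.linalg.inv(c[:dh, :dh])
            y += w * (m[dh:] + gain @ (x_h - m[:dh]))
        preds.append(y)
    return np.array(preds)

# Hypothetical toy model: 2 hidden states, 1 human dim, 1 robot dim.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
covs = np.array([[[1.0, 0.5], [0.5, 1.0]]] * 2)
X_h = np.array([[0.0], [5.0]])
preds = predict_robot(X_h, pi, A, means, covs, dh=1)
```

In this toy setup the two human observations sit on the two state means, so the prediction follows the robot mean of the matching state. Observations that fall between the two means are where the human-only forward variable is least certain, which is the regime the additional mixture at the transition boundary is meant to address.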
