Toward Global Intent Inference for Human Motion by Inverse Reinforcement Learning

Sarmad Mehrdad, Maxime Sabbah, Vincent Bonnet, Ludovic Righetti

TL;DR

A single subject- and posture-agnostic time-varying cost function predicts human reaching trajectories with high accuracy, supporting the existence of a unified optimality principle governing this class of movements.

Abstract

This paper investigates whether a single, unified cost function can explain and predict human reaching movements, in contrast with existing approaches that rely on subject- or posture-specific optimization criteria. Using the Minimal Observation Inverse Reinforcement Learning (MO-IRL) algorithm, together with a seven-dimensional set of candidate cost terms, we efficiently estimate time-varying cost weights for a standard planar reaching task. MO-IRL provides orders-of-magnitude faster convergence than bilevel formulations, while using only a fraction of the available data, enabling the practical exploration of time-varying cost structures. Three levels of generality are evaluated: Subject-Dependent Posture-Dependent, Subject-Dependent Posture-Independent, and Subject-Independent Posture-Independent. Across all cases, time-varying weights substantially improve trajectory reconstruction, yielding an average 27% reduction in RMSE compared to the baseline. The inferred costs consistently highlight a dominant role for joint-acceleration regulation, complemented by smaller contributions from torque-change smoothness. Overall, a single subject- and posture-agnostic time-varying cost function is shown to predict human reaching trajectories with high accuracy, supporting the existence of a unified optimality principle governing this class of movements.
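
To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the cost is a time-indexed weighted sum of feature values, $J = \sum_t \sum_i w_i(t)\,\Phi_i(t)$, and that reconstruction quality is measured by RMSE between predicted and observed joint trajectories. The function names (`trajectory_cost`, `rmse`, `relative_reduction`) and the toy numbers are hypothetical and do not come from the paper.

```python
import math


def trajectory_cost(features, weights):
    """Return sum over t and i of w_i(t) * phi_i(t).

    features[t][i] is the value of candidate cost term i at time step t;
    weights[t][i] is the corresponding time-varying weight (assumed layout).
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must cover the same time steps")
    total = 0.0
    for phi_t, w_t in zip(features, weights):
        if len(phi_t) != len(w_t):
            raise ValueError("each time step needs one weight per feature")
        total += sum(w * phi for w, phi in zip(w_t, phi_t))
    return total


def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


def relative_reduction(baseline_error, candidate_error):
    """Fractional error reduction of a candidate relative to a baseline."""
    return (baseline_error - candidate_error) / baseline_error


# Toy example with two time steps and two features (the paper uses seven).
toy_cost = trajectory_cost([[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [1.0, 0.0]])
toy_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
toy_gain = relative_reduction(10.0, 7.3)
```

Under this reading, a constant-weight cost is the special case in which every row of `weights` is identical, so comparing the two settings only requires swapping the weight table.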

Paper Structure (12 sections, 4 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: (a) Biomechanical model definition, showing the beginning and the end of the pointing task. (b) Five different initial postures for the pointing task [berret2011].
  • Figure 2: Estimated weights and cost contributions, averaged across all subjects, for initial postures 1-5, obtained from the SDPD analysis. The plots show the prominence of the Joint Acceleration feature ($\Phi_4$) in the cost function.
  • Figure 3: Boxplot comparison of RMSE values for all investigated cases against the baseline, averaged across all subjects. The red line in each boxplot indicates the mean value.
  • Figure 4: Boxplot comparison of RMSE values (deg) for all methods against the baseline, averaged across all initial postures. The red line in each boxplot indicates the mean value.
  • Figure 5: Weights and cost contributions of the general cost function proposed for the reaching task in the SIPI case. The bottom panel shows boxplots comparing the RMSE between all existing demonstrations and predictions.
  • ...and 1 more figure