Rapid Online Learning of Hip Exoskeleton Assistance Preferences
Giulia Ramella, Auke Ijspeert, Mohamed Bouri
TL;DR
This work tackles the problem of rapidly personalizing hip exoskeleton assistance by replacing lengthy, model-based tuning with an online, reward-based learning framework driven by user preferences. It uses active pairwise comparisons of gait-cycle-dependent torque profiles and a Bayesian learning loop in which the reward $R(\xi) = w^\top \Phi(\xi)$ and the choice model $P(c=\xi_c) = \frac{\exp(R(\xi_c))}{\sum_j \exp(R(\xi_j))}$ guide each iteration, with the belief updated via Metropolis-Hastings within the APReL library. In eight healthy subjects, distinct preferred torque profiles emerge and are largely robust to perturbations; preferred profiles are linked to walking strategy and show reduced negative power relative to positive power, while preserving kinematic synergies. The approach demonstrates rapid online learning of user-specific rewards without pretraining, enabling practical, reward-based human-exoskeleton interaction in real time.
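The learning loop described above can be illustrated with a minimal sketch. It is written in plain NumPy and does not use the APReL API; the function names, the uniform prior on the unit ball, the Gaussian random-walk proposal, and the simulated user are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def choice_probabilities(w, features):
    """Softmax choice model: P(c = xi_c) = exp(R(xi_c)) / sum_j exp(R(xi_j)),
    with linear reward R(xi) = w^T Phi(xi). `features` has one row per option."""
    rewards = features @ w
    rewards = rewards - rewards.max()  # shift for numerical stability
    expr = np.exp(rewards)
    return expr / expr.sum()

def log_posterior(w, queries, choices):
    """Unnormalised log posterior over w.
    Assumption: uniform prior on the unit ball ||w|| <= 1."""
    if np.linalg.norm(w) > 1.0:
        return -np.inf
    return sum(np.log(choice_probabilities(w, f)[c])
               for f, c in zip(queries, choices))

def metropolis_hastings(queries, choices, dim, n_samples=2000, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings over the reward weights w."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    logp = log_posterior(w, queries, choices)
    samples = []
    for _ in range(n_samples):
        proposal = w + step * rng.standard_normal(dim)  # symmetric proposal
        logp_new = log_posterior(proposal, queries, choices)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < logp_new - logp:
            w, logp = proposal, logp_new
        samples.append(w.copy())
    return np.array(samples)

# Simulated user (hypothetical): always picks the option with the higher true reward.
rng = np.random.default_rng(1)
w_true = np.array([0.8, -0.5, 0.2])
queries = [rng.standard_normal((2, 3)) for _ in range(40)]  # pairwise comparisons
choices = [int(np.argmax(q @ w_true)) for q in queries]

samples = metropolis_hastings(queries, choices, dim=3)
w_hat = samples[500:].mean(axis=0)  # posterior mean after burn-in
```

Each answered comparison adds one likelihood term, so the posterior over $w$ sharpens as queries accumulate; the posterior mean `w_hat` then serves as a point estimate of the user's reward weights.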
Abstract
Hip exoskeletons are increasing in popularity due to their effectiveness across various scenarios and their ability to adapt to different users. However, personalizing the assistance often requires lengthy tuning procedures and computationally intensive algorithms, and most existing methods do not incorporate user feedback. In this work, we propose a novel approach for rapidly learning users' preferences for hip exoskeleton assistance. We perform pairwise comparisons of distinct randomly generated assistive profiles, and collect participants' preferences through active querying. Users' feedback is integrated into a preference-learning algorithm that updates its belief, learns a user-dependent reward function, and changes the assistive torque profiles accordingly. Results from eight healthy subjects display distinct preferred torque profiles, and users' choices remain consistent when compared to a perturbed profile. A comprehensive evaluation of users' preferences reveals a close relationship with individual walking strategies. The tested torque profiles do not disrupt kinematic joint synergies, and participants favor assistive torques that are synchronized with their movements, resulting in lower negative power from the device. This straightforward approach enables the rapid learning of users' preferences and rewards, grounding future studies on reward-based human-exoskeleton interaction.
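To make the link between synchronization and negative power concrete, the sketch below splits the device's mechanical power (torque times joint angular velocity) into positive and negative work over one gait cycle. The sinusoidal signals, the function name `power_split`, and the quarter-cycle shift are illustrative assumptions; the processing actually used in the study is not described in this summary.

```python
import numpy as np

def power_split(torque, angular_velocity, dt):
    """Return (positive_work, negative_work) in joules.
    Mechanical power is torque [N*m] times angular velocity [rad/s]."""
    power = torque * angular_velocity
    positive_work = np.sum(np.clip(power, 0.0, None)) * dt
    negative_work = np.sum(np.clip(power, None, 0.0)) * dt
    return positive_work, negative_work

# Synthetic one-second gait cycle (hypothetical signals).
t = np.linspace(0.0, 1.0, 200, endpoint=False)
dt = t[1] - t[0]
hip_velocity = 2.0 * np.sin(2 * np.pi * t)             # rad/s

torque_in_phase = 5.0 * np.sin(2 * np.pi * t)          # synchronized with motion
torque_shifted = 5.0 * np.sin(2 * np.pi * (t - 0.25))  # quarter-cycle late

pos_sync, neg_sync = power_split(torque_in_phase, hip_velocity, dt)
pos_late, neg_late = power_split(torque_shifted, hip_velocity, dt)
```

In this toy case the in-phase torque never opposes the motion, so its negative work is zero, whereas the shifted torque resists the joint for half of the cycle and produces negative work.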
