Rapid Online Learning of Hip Exoskeleton Assistance Preferences

Giulia Ramella, Auke Ijspeert, Mohamed Bouri

TL;DR

This work tackles the problem of rapidly personalizing hip exoskeleton assistance by replacing lengthy, model-based tuning with an online, reward-based learning framework driven by user preferences. It uses active pairwise comparisons of gait-cycle-dependent torque profiles and a Bayesian learning loop in which a linear reward $R(\xi) = w^\top \Phi(\xi)$ and a softmax choice model $P(c=\xi_c) = \frac{\exp(R(\xi_c))}{\sum_j \exp(R(\xi_j))}$ guide each iteration, with the belief updated via Metropolis-Hastings within the APReL library. In eight healthy subjects, distinct preferred torque profiles emerge and are largely robust to perturbations; preferred profiles are linked to walking strategy and show reduced negative power relative to positive power, while preserving kinematic synergies. The approach demonstrates rapid online learning of user-specific rewards without pretraining, enabling practical, reward-based human-exoskeleton interaction in real time.
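
To make the two formulas in the TL;DR concrete, the following is a minimal sketch of the linear reward and the softmax choice model for a single pairwise query. It does not use the APReL API; the function names, the weight vector, and the feature values are hypothetical and chosen only for illustration. The six-dimensional feature vector mirrors the six torque-profile features mentioned in Figure 2, but the numbers are not taken from the paper.

```python
import numpy as np

def reward(w, phi):
    """Linear reward R(xi) = w^T Phi(xi) for one torque profile."""
    return float(np.dot(w, phi))

def choice_probabilities(w, features):
    """Softmax probability that the user picks each profile in a query.

    features: array of shape (n_profiles, n_features), one row per profile.
    """
    rewards = np.asarray(features, dtype=float) @ np.asarray(w, dtype=float)
    rewards = rewards - rewards.max()  # shift for numerical stability
    expd = np.exp(rewards)
    return expd / expd.sum()

# Hypothetical weights and features (assumptions, not values from the paper).
w = np.array([1.0, 0.0, 0.0, 0.0, 0.0, -1.0])
query = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # profile A: reward = +1
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],  # profile B: reward = -1
])

p = choice_probabilities(w, query)
print(p)  # p[0] is about 0.88: profile A is the more likely choice
```

With only two profiles per query, the softmax reduces to a logistic function of the reward difference, here $1/(1+e^{-2})$.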

Abstract

Hip exoskeletons are increasing in popularity due to their effectiveness across various scenarios and their ability to adapt to different users. However, personalizing the assistance often requires lengthy tuning procedures and computationally intensive algorithms, and most existing methods do not incorporate user feedback. In this work, we propose a novel approach for rapidly learning users' preferences for hip exoskeleton assistance. We perform pairwise comparisons of distinct randomly generated assistive profiles, and collect participants' preferences through active querying. Users' feedback is integrated into a preference-learning algorithm that updates its belief, learns a user-dependent reward function, and changes the assistive torque profiles accordingly. Results from eight healthy subjects display distinct preferred torque profiles, and users' choices remain consistent when compared to a perturbed profile. A comprehensive evaluation of users' preferences reveals a close relationship with individual walking strategies. The tested torque profiles do not disrupt kinematic joint synergies, and participants favor assistive torques that are synchronized with their movements, resulting in lower negative power from the device. This straightforward approach enables the rapid learning of users' preferences and rewards, grounding future studies on reward-based human-exoskeleton interaction.
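
The belief update described in the abstract can be illustrated with a small Metropolis-Hastings sampler over the reward weights. This is a sketch under stated assumptions, not the authors' implementation and not the APReL API: it assumes a uniform prior over unit-norm weight vectors, a softmax likelihood for each pairwise choice, and a simulated user who always picks the higher-reward profile. All names and numeric settings (`sample_posterior`, `step`, the number of queries) are hypothetical.

```python
import numpy as np

def log_likelihood(w, queries, choices):
    """Sum of log softmax probabilities of the observed choices."""
    total = 0.0
    for feats, c in zip(queries, choices):
        r = feats @ w
        total += r[c] - np.logaddexp.reduce(r)
    return total

def sample_posterior(queries, choices, dim, n_samples=2000,
                     burn_in=500, step=0.2, seed=0):
    """Metropolis-Hastings samples of w on the unit sphere.

    Assumes a uniform prior on the sphere; the proposal (Gaussian
    perturbation, then renormalization) is symmetric, so the
    acceptance ratio reduces to the likelihood ratio.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=dim)
    w /= np.linalg.norm(w)
    logp = log_likelihood(w, queries, choices)
    samples = []
    for i in range(n_samples + burn_in):
        prop = w + step * rng.normal(size=dim)
        prop /= np.linalg.norm(prop)
        logp_prop = log_likelihood(prop, queries, choices)
        if np.log(rng.uniform()) < logp_prop - logp:
            w, logp = prop, logp_prop
        if i >= burn_in:
            samples.append(w.copy())
    return np.array(samples)

# Simulated user (an assumption for this demo, not data from the paper).
rng = np.random.default_rng(1)
dim = 6
w_true = rng.normal(size=dim)
w_true /= np.linalg.norm(w_true)
queries = [rng.normal(size=(2, dim)) for _ in range(50)]  # 50 pairwise queries
choices = [int(np.argmax(q @ w_true)) for q in queries]   # user picks the higher reward

samples = sample_posterior(queries, choices, dim)
w_mean = samples.mean(axis=0)
w_mean /= np.linalg.norm(w_mean)
alignment = float(w_mean @ w_true)  # cosine between learned and true weights
print(alignment)
```

The cosine between the posterior mean and the simulated user's true weights gives a quick check that the sampled belief is moving toward the preferences expressed in the answers. In a real experiment the true weights are unknown, so only the posterior samples are available.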

Paper Structure

This paper contains 9 sections, 9 figures, 2 tables, 1 algorithm.

Figures (9)

  • Figure 1: Framework for the rapid online learning of preferred assistive torque provided by eWalk, a hip exoskeleton for partial assistance. Participants are actively queried on their preferred assistance through pairwise comparisons of distinct torque profiles. Their feedback is integrated into a preference-based algorithm which updates the belief distribution, learns a user-specific reward, and adjusts the torque profile in real time.
  • Figure 2: (A): Hip torque profile parameterized using six features, which were adjusted in real time by the preference-learning algorithm. (B): Schematic diagram of our experiment, where users wear the eWalk hip exoskeleton and the XSens motion capture system. Active querying is used to collect each user's preference between consecutive pairs of torque profiles. Individual preferences are integrated into our preference-learning algorithm, which updates its belief distribution and adapts the torque profile. (C): Set of torque profiles tested on a representative subject, and the final chosen preferred assistive torque (bold blue line).
  • Figure 3: Preferred assistive torque profiles (above), and corresponding power profiles (below), at the end of the pairwise comparisons, for each participant in the experiment. The "Initial" torque profile is used during the first familiarization session with the exoskeleton, and is kept the same across subjects.
  • Figure 4: Final preferred torque profile features at the conclusion of the learning process for each subject. The gray areas represent the possible range of values for each feature.
  • Figure 5: Normalized weights of the reward function over the pairwise comparisons for two representative subjects (S2 above and S3 below). At the beginning, the weights are initialized with random values. The algorithm then progressively learns the weights from the user's feedback.
  • ...and 4 more figures