Physics-Informed Learning for Human Whole-Body Kinematics Prediction via Sparse IMUs
Cheng Guo, Giuseppe L'Erario, Giulio Romualdi, Mattia Leonori, Marta Lorenzini, Arash Ajoudani, Daniele Pucci
TL;DR
This work tackles the problem of predicting future whole-body human kinematics from sparse inertial data in human-robot collaboration. It introduces the Physics-Informed Neural Kinematics Predictor (PINKP), which couples a neural network with forward and differential kinematics losses ($L_{\text{FK}}$, $L_{\text{DK}}$) and uses a joint state buffer to support autoregressive inference. A joint kinematics optimizer refines the first-step prediction at inference, updating the buffer to promote smooth transitions. Evaluations on a self-collected 17-IMU dataset show improved accuracy and real-time performance over LSTM, TCN, and TIP baselines, with robust generalization to unseen subjects and smoother motion sequences.
Abstract
Accurate and physically feasible human motion prediction is crucial for safe and seamless human-robot collaboration. While recent advancements in human motion capture enable real-time pose estimation, the practical value of many existing approaches is limited because they neither predict future motion nor account for physical constraints. Conventional motion prediction schemes rely heavily on past poses, which are not always available in real-world scenarios. To address these limitations, we present a physics-informed learning framework that integrates domain knowledge into both training and inference to predict human motion using inertial measurements from only 5 IMUs. We propose a network that accounts for the spatial characteristics of human movements. During training, we incorporate forward and differential kinematics functions as additional loss components to regularize the learned joint predictions. At the inference stage, we refine the prediction from the previous iteration to update a joint state buffer, which is fed to the network as an additional input. Experimental results demonstrate that our approach achieves high accuracy and smooth transitions between motions, and that it generalizes well to unseen subjects.
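
To make the role of the forward and differential kinematics loss terms concrete, the following is a minimal sketch on a toy planar serial chain, not the whole-body human model used in this work. It assumes a mean-squared-error data term on joint positions and velocities, a forward kinematics term that compares the link-end positions implied by predicted and reference joint angles, and a differential kinematics term that compares the link-end velocities obtained through the Jacobian. The function names, the weights `w_fk` and `w_dk`, and the planar model are all hypothetical and chosen only for illustration.

```python
import numpy as np


def forward_kinematics(q, link_lengths):
    """End position of a planar serial chain (toy stand-in for a human model)."""
    angles = np.cumsum(q)  # absolute orientation of each link
    x = np.sum(link_lengths * np.cos(angles))
    y = np.sum(link_lengths * np.sin(angles))
    return np.array([x, y])


def jacobian(q, link_lengths):
    """Analytic 2 x n Jacobian of the end position w.r.t. the joint angles."""
    angles = np.cumsum(q)
    n = len(q)
    J = np.zeros((2, n))
    for i in range(n):
        # Joint i rotates every link from i to the end of the chain.
        J[0, i] = -np.sum(link_lengths[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(link_lengths[i:] * np.cos(angles[i:]))
    return J


def physics_informed_loss(q_pred, dq_pred, q_true, dq_true, link_lengths,
                          w_fk=1.0, w_dk=1.0):
    """Data loss plus FK and DK regularizers (weights are assumptions)."""
    # Data term: direct error on joint positions and velocities.
    l_data = np.mean((q_pred - q_true) ** 2) + np.mean((dq_pred - dq_true) ** 2)

    # Forward kinematics term: error on the Cartesian end position.
    p_pred = forward_kinematics(q_pred, link_lengths)
    p_true = forward_kinematics(q_true, link_lengths)
    l_fk = np.mean((p_pred - p_true) ** 2)

    # Differential kinematics term: error on the Cartesian end velocity v = J(q) dq.
    v_pred = jacobian(q_pred, link_lengths) @ dq_pred
    v_true = jacobian(q_true, link_lengths) @ dq_true
    l_dk = np.mean((v_pred - v_true) ** 2)

    total = l_data + w_fk * l_fk + w_dk * l_dk
    return total, l_fk, l_dk


if __name__ == "__main__":
    lengths = np.array([1.0, 1.0])
    q_true = np.array([0.3, -0.2])
    dq_true = np.array([0.1, 0.4])
    q_pred = q_true + np.array([0.05, -0.02])
    dq_pred = dq_true + np.array([-0.03, 0.01])
    total, l_fk, l_dk = physics_informed_loss(q_pred, dq_pred, q_true, dq_true, lengths)
    print(f"total={total:.6f}  L_FK={l_fk:.6f}  L_DK={l_dk:.6f}")
```

In this sketch the two extra terms penalize joint predictions whose errors accumulate along the chain into large Cartesian position or velocity errors, which a purely joint-space loss would weight equally regardless of where in the chain they occur. In a training setting the same computation would be written in an automatic differentiation framework so that gradients flow through the kinematics functions.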
