Improving Sparse IMU-based Motion Capture with Motion Label Smoothing
Zhaorui Meng, Lu Yin, Yangqing Hou, Anjun Chen, Shihui Guo, Yipeng Qin
TL;DR
This work introduces motion label smoothing to regularize sparse IMU-based motion capture, a task where regularization has been underexplored. It identifies three intrinsic properties of human motion (temporal smoothness, joint correlation, and low-frequency dominance) and argues that naive entropy-augmentation methods disrupt these properties. To address this, the authors design a skeleton-based Perlin noise (sk-Perlin) that increases label entropy while respecting biomechanical constraints, implemented via amplitude decoupling to preserve smoothness and correlations. Applied plug-and-play to three state-of-the-art methods on four real-world IMU datasets, it yields consistent gains in pose accuracy and robustness, establishing motion label smoothing as a practical addition to the AI toolkit for sparse IMU motion capture.
Abstract
Human motion capture based on sparse Inertial Measurement Units (IMUs) has gained significant momentum, driven by the adaptation of fundamental AI tools such as recurrent neural networks (RNNs) and transformers that are tailored for temporal and spatial modeling. Despite these achievements, current research predominantly focuses on pipeline and architectural designs, with comparatively little attention given to regularization methods, highlighting a critical gap in developing a comprehensive AI toolkit for this task. To bridge this gap, we propose motion label smoothing, a novel method that adapts the classic label smoothing strategy from classification to the sparse IMU-based motion capture task. Specifically, we first demonstrate that naive adaptations of label smoothing, such as simply blending in a uniform vector or a "uniform" motion representation (e.g., dataset-average motion or a canonical T-pose), are suboptimal, and we argue that a proper adaptation requires increasing the entropy of the smoothed labels. Second, we conduct a thorough analysis of human motion labels, identifying three critical properties: 1) Temporal Smoothness, 2) Joint Correlation, and 3) Low-Frequency Dominance, and show that conventional approaches to entropy enhancement (e.g., blending in Gaussian noise) are ineffective because they disrupt these properties. Finally, we propose blending a novel skeleton-based Perlin noise into the motion labels, designed to raise label entropy while preserving these motion properties. Extensive experiments applying our motion label smoothing to three state-of-the-art methods across four real-world IMU datasets demonstrate its effectiveness and robust generalization (plug-and-play) capability.
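To make the idea of blending structured noise into motion labels concrete, the following is a minimal illustrative sketch, not the authors' sk-Perlin formulation. It assumes the classic label smoothing blend y' = (1 - eps) * y + eps * n, a 1D Perlin-style gradient noise over time, and a hypothetical parent-sharing rule to correlate neighbouring joints. The function names, the parameters `eps`, `period`, and `share`, and the use of one scalar per joint (real pose labels are typically rotations) are all assumptions made for illustration.

```python
import math
import random


def perlin_1d(num_frames, period, rng):
    """1D Perlin-style gradient noise sampled once per frame.

    `period` is the number of frames per lattice cell; a larger period
    gives lower-frequency noise. Values stay within [-0.5, 0.5].
    """
    num_cells = num_frames // period + 2
    grads = [rng.uniform(-1.0, 1.0) for _ in range(num_cells)]
    out = []
    for t in range(num_frames):
        x = t / period
        i = int(x)
        f = x - i
        # Perlin's quintic fade curve for smooth interpolation.
        u = f * f * f * (f * (f * 6 - 15) + 10)
        a = grads[i] * f                # contribution of the left lattice point
        b = grads[i + 1] * (f - 1.0)    # contribution of the right lattice point
        out.append(a + u * (b - a))
    return out


def skeleton_noise(parents, num_frames, period, share, rng):
    """One noise track per joint (hypothetical correlation rule).

    Each child mixes in a fraction `share` of its parent's track, so joints
    that are adjacent in the kinematic tree receive correlated noise.
    Assumes every parent index precedes its children; the root uses -1.
    """
    tracks = []
    for p in parents:
        own = perlin_1d(num_frames, period, rng)
        if p < 0:
            tracks.append(own)
        else:
            tracks.append([share * a + (1 - share) * b
                           for a, b in zip(tracks[p], own)])
    return tracks


def smooth_labels(labels, parents, eps=0.1, period=30, share=0.5, seed=0):
    """Blend skeleton-correlated smooth noise into labels[t][j]."""
    rng = random.Random(seed)
    num_frames = len(labels)
    tracks = skeleton_noise(parents, num_frames, period, share, rng)
    return [[(1 - eps) * labels[t][j] + eps * tracks[j][t]
             for j in range(len(parents))]
            for t in range(num_frames)]


# Toy example: a 4-joint tree and a synthetic sinusoidal "motion".
parents = [-1, 0, 1, 1]
labels = [[math.sin(0.05 * t + j) for j in range(4)] for t in range(120)]
smoothed = smooth_labels(labels, parents)
```

In this sketch the noise is temporally smooth and low-frequency because of the lattice interpolation, and correlated across joints because of the parent sharing. Replacing `perlin_1d` with independent per-frame Gaussian samples would give the kind of noise the abstract argues against, since it breaks all three properties.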
