Improving Sparse IMU-based Motion Capture with Motion Label Smoothing

Zhaorui Meng, Lu Yin, Yangqing Hou, Anjun Chen, Shihui Guo, Yipeng Qin

TL;DR

This work introduces motion label smoothing to regularize sparse IMU-based motion capture, a task where regularization has been underexplored. It identifies three intrinsic properties of human motion (temporal smoothness, joint correlation, and low-frequency dominance) and argues that naive entropy-augmentation methods disrupt them. To address this, the authors design a skeleton-based Perlin noise (sk-Perlin) that increases label entropy while respecting biomechanical constraints, implemented via amplitude decoupling to preserve smoothness and correlations. Applied plug-and-play to three state-of-the-art methods on four real IMU datasets, the technique yields consistent gains in pose accuracy and robustness, establishing motion label smoothing as a practical addition to the AI toolkit for sparse IMU motion capture.

Abstract

Sparse inertial measurement unit (IMU)-based human motion capture has gained significant momentum, driven by the adaptation of fundamental AI tools such as recurrent neural networks (RNNs) and transformers that are tailored for temporal and spatial modeling. Despite these achievements, current research predominantly focuses on pipeline and architectural designs, with comparatively little attention given to regularization methods, highlighting a critical gap in developing a comprehensive AI toolkit for this task. To bridge this gap, we propose motion label smoothing, a novel method that adapts the classic label smoothing strategy from classification to the sparse IMU-based motion capture task. Specifically, we first demonstrate that a naive adaptation of label smoothing, such as simply blending in a uniform vector or a "uniform" motion representation (e.g., the dataset-average motion or a canonical T-pose), is suboptimal, and argue that a proper adaptation requires increasing the entropy of the smoothed labels. Second, we conduct a thorough analysis of human motion labels, identifying three critical properties: 1) Temporal Smoothness, 2) Joint Correlation, and 3) Low-Frequency Dominance, and show that conventional approaches to entropy enhancement (e.g., blending in Gaussian noise) are ineffective because they disrupt these properties. Finally, we propose blending in a novel skeleton-based Perlin noise for motion label smoothing, designed to raise label entropy while satisfying these motion properties. Extensive experiments applying our motion label smoothing to three state-of-the-art methods across four real-world IMU datasets demonstrate its effectiveness and robust generalization (plug-and-play) capability.
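
For readers unfamiliar with the classification technique the abstract builds on, the following is a minimal sketch of classic label smoothing and of the entropy increase it produces. It illustrates only the classification case; the function names and the value of `eps` are illustrative assumptions, and nothing here reproduces the paper's motion-specific method.

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    """Classic label smoothing: blend a one-hot target with the uniform distribution.

    `eps` is an illustrative smoothing weight, not a value taken from the paper.
    """
    k = one_hot.shape[-1]
    return (1.0 - eps) * one_hot + eps / k

def entropy(p):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

one_hot = np.array([0.0, 0.0, 1.0, 0.0])
smoothed = smooth_labels(one_hot, eps=0.1)

print("smoothed target:", smoothed)
print("entropy before :", entropy(one_hot))
print("entropy after  :", entropy(smoothed))
```

The smoothed target still sums to one, but its entropy is strictly positive, whereas the one-hot target has zero entropy. This is the property the abstract argues a proper adaptation to motion labels must preserve.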

Paper Structure

This paper contains 47 sections, 7 theorems, 39 equations, 5 figures, 6 tables.

Key Result

Proposition 1

Applying Gaussian noise or uniform noise with sufficiently large noise amplitude does not satisfy Property 1 (Temporal Smoothness).
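
The intuition behind Proposition 1 can be checked numerically for the Gaussian case. The formal definition of Property 1 is not reproduced on this page, so the sketch below assumes, as a proxy, that temporal smoothness means a small frame-to-frame change. For i.i.d. Gaussian noise with standard deviation $\sigma$, the difference between consecutive frames is distributed as $\mathcal{N}(0, 2\sigma^2)$, so its expected absolute value is $2\sigma/\sqrt{\pi}$, which grows linearly with the amplitude. All names below are illustrative.

```python
import numpy as np

def mean_frame_jump(x):
    """Mean absolute difference between consecutive frames (assumed smoothness proxy)."""
    return float(np.mean(np.abs(np.diff(x))))

rng = np.random.default_rng(0)
T = 100_000                      # number of frames in the synthetic sequence
base = rng.normal(size=T)        # unit-variance i.i.d. Gaussian noise

sigmas = [0.01, 0.1, 1.0]        # illustrative noise amplitudes
jumps = {s: mean_frame_jump(s * base) for s in sigmas}
expected = {s: 2.0 * s / np.sqrt(np.pi) for s in sigmas}

for s in sigmas:
    print(f"sigma={s:<5} measured jump={jumps[s]:.4f}  2*sigma/sqrt(pi)={expected[s]:.4f}")
```

Because the jump scales with $\sigma$, any fixed bound on frame-to-frame change is eventually exceeded once the amplitude is large enough, which matches the statement of the proposition under the assumed proxy.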

Figures (5)

  • Figure 1: A live comparison between the state-of-the-art sparse IMU-based motion capture system, GlobalPose [yi2025improving] (left, red), and its improved variant enhanced with our motion label smoothing technique (right, green) clearly illustrates the effectiveness of our method.
  • Figure 2: (a) Noise terrain of three types of noise in the frame-joint coordinate system, where only Perlin noise exhibits continuity across frames and correlation across joints. (b) Power spectral density (PSD) of the three types of noise, where only Perlin noise is dominated by low-frequency components.
  • Figure 3: Overview of our Motion Label Smoothing method. The ground truth motion label $R$ is blended with a carefully constructed skeleton-based Perlin noise $u$, which satisfies Properties 1, 2, and 3 and has a sufficiently large noise amplitude for effective regularization.
  • Figure 4: Qualitative comparisons with baseline methods. Examples are from the TotalCapture and CIP datasets.
  • Figure 5: Comparison of translation drifting error on the TotalCapture dataset. We plot the global position error accumulation curve with respect to the ground truth traveled distance. A lower curve indicates smaller drift.
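
The low-frequency contrast described in the caption of Figure 2(b) can be reproduced in miniature. The sketch below compares the share of spectral power at low frequencies for Gaussian noise, uniform noise, and a plain 1D Perlin-style gradient noise along the frame axis. It assumes these are the three noise types the figure refers to. The generator is a textbook 1D Perlin construction, not the paper's skeleton-based variant, and the `period` and `cutoff` values are illustrative.

```python
import numpy as np

def perlin_1d(n, period, rng):
    """Textbook 1D Perlin-style noise: random gradients every `period` frames,
    blended with the fade curve 6t^5 - 15t^4 + 10t^3."""
    grads = rng.uniform(-1.0, 1.0, size=n // period + 2)
    x = np.arange(n) / period
    i = x.astype(int)            # index of the lattice point to the left
    t = x - i                    # position inside the cell, in [0, 1)
    fade = t**3 * (t * (6 * t - 15) + 10)
    a, b = grads[i] * t, grads[i + 1] * (t - 1.0)
    return a + (b - a) * fade

def low_freq_fraction(x, cutoff=0.05):
    """Share of spectral power below `cutoff` cycles per frame, after removing the mean."""
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0)
    return float(power[freqs < cutoff].sum() / power.sum())

rng = np.random.default_rng(0)
T = 1024
signals = {
    "gaussian": rng.normal(size=T),
    "uniform": rng.uniform(-1.0, 1.0, size=T),
    "perlin": perlin_1d(T, period=64, rng=rng),
}
fractions = {name: low_freq_fraction(x) for name, x in signals.items()}

for name, frac in fractions.items():
    print(f"{name:<8} low-frequency power share: {frac:.2f}")
```

White noise spreads its power roughly evenly over all frequencies, so only a small share falls below the cutoff, while the Perlin-style signal concentrates most of its power there.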

Theorems & Definitions (14)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Proposition 4
  • proof
  • Proposition 5
  • proof
  • ...and 4 more