Physical Non-inertial Poser (PNP): Modeling Non-inertial Effects in Sparse-inertial Human Motion Capture

Xinyu Yi, Yuxiao Zhou, Feng Xu

TL;DR

The paper tackles sparse-IMU motion capture by addressing non-inertial effects in the root frame. It introduces the Physical Non-inertial Poser (PNP), which models fictitious forces with an auto-regressive estimator and uses them to correct the measured accelerations. A data-synthesis pipeline generates realistic high-rate IMU signals, including calibration errors, to train the network and narrow the gap to real hardware. The key contributions are the fictitious-force modeling, the auto-regressive estimator for the fictitious acceleration a_fic, and the hardware-aware IMU synthesis, which together improve pose accuracy and reduce translation drift on challenging motions. The approach targets robust, real-time mocap from as few as six IMUs in realistic settings, with potential applicability to consumer devices and outdoor scenarios where optical systems are impractical.

Abstract

Existing inertial motion capture techniques use the human root coordinate frame to estimate local poses and treat it as an inertial frame by default. We argue that when the root has linear acceleration or rotation, the root frame should in theory be considered non-inertial. In this paper, we model the fictitious forces that are non-negligible in a non-inertial frame with an auto-regressive estimator carefully designed to follow physics. With the fictitious forces, the force-related IMU measurements (accelerations) can be correctly compensated in the non-inertial frame, so that Newton's laws of motion are satisfied. In this case, the relationship between the accelerations and body motions is deterministic and learnable, and we train a neural network to model it for better motion capture. Furthermore, to train the neural network with synthetic data, we develop an IMU-synthesis-by-simulation strategy that better models the noise of IMU hardware and allows parameter tuning to fit different hardware. This strategy not only enables network training with synthetic data but also supports calibration error modeling to handle poor motion capture calibration, increasing the robustness of the system. Code is available at https://xinyu-yi.github.io/PNP/.
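
To make the notion of a fictitious acceleration concrete, the following is a minimal sketch of the textbook rigid-body expression for a point observed from an accelerating, rotating frame. It is not the paper's auto-regressive estimator, which is a learned component; the function name `fictitious_acceleration` and all argument names are assumptions chosen for illustration, and every vector is assumed to be expressed in root-frame coordinates.

```python
import numpy as np

def fictitious_acceleration(a_root, omega, alpha, r, v):
    """Classical fictitious acceleration seen in a non-inertial frame.

    Hypothetical helper for illustration only (not the paper's estimator).
    All inputs are 3-vectors expressed in root-frame coordinates:
      a_root: linear acceleration of the frame origin
      omega:  angular velocity of the frame
      alpha:  angular acceleration of the frame
      r, v:   position and velocity of the point relative to the frame
    """
    a_root, omega, alpha, r, v = (
        np.asarray(x, dtype=float) for x in (a_root, omega, alpha, r, v)
    )
    linear = -a_root                                     # frame translation term
    euler = -np.cross(alpha, r)                          # Euler term
    coriolis = -2.0 * np.cross(omega, v)                 # Coriolis term
    centrifugal = -np.cross(omega, np.cross(omega, r))   # centrifugal term
    return linear + euler + coriolis + centrifugal

# A point fixed at r = (1, 0, 0) in a frame spinning at 2 rad/s about z.
# In the inertial frame it undergoes centripetal acceleration (-4, 0, 0).
omega = np.array([0.0, 0.0, 2.0])
r = np.array([1.0, 0.0, 0.0])
a_inertial = np.array([-4.0, 0.0, 0.0])

a_fic = fictitious_acceleration(np.zeros(3), omega, np.zeros(3), r, np.zeros(3))
a_local = a_inertial + a_fic  # acceleration as observed inside the rotating frame
print(a_fic, a_local)
```

In this example the compensated local acceleration vanishes, which matches the fact that the point does not move relative to the rotating frame; without the compensation, the root-frame observation would wrongly suggest the point is accelerating.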

Paper Structure (34 sections, 12 equations, 8 figures, 3 tables)


Figures (8)

  • Figure 1: Overview of our motion capture method. We first estimate the fictitious force in the non-inertial root frame (grey, left). Then, we transform the IMU measurements from the world frame to the root frame to estimate the local pose, while accounting for the fictitious force (green). Next, we estimate the human's global motion from the pose and the IMUs (purple). Finally, we employ a physics-based optimizer to refine the human motion (grey, right).
  • Figure 2: Illustration of the necessity of modeling non-inertial effects in human local pose estimation. Without fictitious accelerations, two different motions (left) yield the same acceleration observations in the root frame (middle), whereas the accelerations are observed correctly once the fictitious accelerations are taken into account (right).
  • Figure 3: Overview of our IMU synthesis method. We first synthesize the 6DoF trajectory (position and orientation) of each IMU from the low frame-rate (60FPS) motion capture data. Then, we calculate the raw high frame-rate (180FPS) IMU signals including the acceleration, angular velocity, and magnetic field measurement from the trajectory. After adding sensor noise to the raw signals, we perform IMU fusion to get the orientation measurement. Finally, we simulate a T-pose calibration process, where the calibration error is added to the sensor readings.
  • Figure 4: Qualitative comparisons with prior works. The examples are picked from the TotalCapture dataset.
  • Figure 5: Qualitative evaluation of (a) the proposed fictitious acceleration and (b) the proposed IMU synthesis method. (a) We plot the joint position error in a sequence from the TotalCapture dataset and compare the reconstructed motions at two selected time intervals. (b) We visualize the predicted motion for the same sequence under both small and large calibration errors (DIP-calibration vs. official calibration).
  • ...and 3 more figures
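
The accelerometer part of the synthesis pipeline described in the Figure 3 caption could be sketched roughly as follows. This is an illustrative simplification under stated assumptions, not the authors' implementation: it assumes the trajectory is already sampled at the high rate, a y-up world frame with gravity of 9.81 m/s^2, sensor-to-world rotation matrices, and a plain Gaussian-plus-bias noise model. The function name `synthesize_accelerometer` and its parameters are hypothetical, and the angular velocity, magnetic field, fusion, and calibration steps are omitted.

```python
import numpy as np

# Assumption: y-up world frame, gravity in m/s^2.
GRAVITY = np.array([0.0, -9.81, 0.0])

def synthesize_accelerometer(pos, rot, fps, noise_std=0.0, bias=None, rng=None):
    """Synthesize accelerometer readings from a 6DoF sensor trajectory.

    Hypothetical sketch, not the paper's pipeline.
      pos: (T, 3) sensor positions in the world frame, sampled at `fps`
      rot: (T, 3, 3) sensor-to-world rotation matrices
    Returns (T - 2, 3) readings in the sensor frame (boundary frames dropped).
    """
    pos = np.asarray(pos, dtype=float)
    rot = np.asarray(rot, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    bias = np.zeros(3) if bias is None else np.asarray(bias, dtype=float)

    # Central second difference gives the world-frame linear acceleration.
    acc_world = (pos[2:] - 2.0 * pos[1:-1] + pos[:-2]) * fps**2
    # An accelerometer senses specific force: acceleration minus gravity.
    specific_force = acc_world - GRAVITY
    # Rotate into the sensor frame by applying R^T per frame.
    acc_sensor = np.einsum("tji,tj->ti", rot[1:-1], specific_force)
    # Simple noise model (assumption): constant bias plus white Gaussian noise.
    noise = rng.normal(0.0, noise_std, size=acc_sensor.shape)
    return acc_sensor + bias + noise

# Usage: a sensor accelerating at 2 m/s^2 along x, sampled at 180 FPS.
fps = 180
t = np.arange(60) / fps
pos = np.stack([0.5 * 2.0 * t**2, np.zeros_like(t), np.zeros_like(t)], axis=1)
rot = np.tile(np.eye(3), (len(t), 1, 1))

clean = synthesize_accelerometer(pos, rot, fps)
noisy = synthesize_accelerometer(pos, rot, fps, noise_std=0.05,
                                 rng=np.random.default_rng(0))
```

Exposing `noise_std` and `bias` as parameters mirrors the idea of tuning the noise model to fit different hardware, although the actual noise model used in the paper may differ.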