PoseAugment: Generative Human Pose Data Augmentation with Physical Plausibility for IMU-based Motion Capture

Zhuojun Li, Chun Yu, Chen Liang, Yuanchun Shi

TL;DR

Experiments show that PoseAugment outperforms previous data augmentation and pose generation methods in terms of motion capture accuracy, revealing a strong potential of the method to alleviate the data collection burden for IMU-based motion capture and related tasks driven by human poses.

Abstract

The data scarcity problem is a crucial factor that hampers the performance of models for IMU-based human motion capture. However, effective data augmentation for IMU-based motion capture is challenging, since it has to capture the physical relations and constraints of the human body while maintaining the data distribution and quality. We propose PoseAugment, a novel pipeline incorporating VAE-based pose generation and physical optimization. Given a pose sequence, the VAE module can generate an unlimited number of poses with both high fidelity and diversity while preserving the data distribution. The physical module optimizes poses to satisfy physical constraints with minimal motion restrictions. High-quality IMU data are then synthesized from the augmented poses for training motion capture models. Experiments show that PoseAugment outperforms previous data augmentation and pose generation methods in terms of motion capture accuracy, revealing a strong potential of our method to alleviate the data collection burden for IMU-based motion capture and related tasks driven by human poses.
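
The abstract states that IMU data are synthesized from the augmented poses, but this section does not say how. One common approach for the accelerometer channel is a second-order finite difference of a joint's global positions over time. The sketch below illustrates that idea only. The function name `synthesize_acceleration`, the frame rate, and the boundary handling are assumptions for illustration rather than the authors' procedure, and gravity, sensor-frame rotation, and noise are deliberately left out.

```python
import numpy as np

def synthesize_acceleration(positions, fps):
    """Approximate linear acceleration from a position trajectory.

    positions: array of shape (T, 3), global positions of one joint (hypothetical input).
    fps: frame rate of the pose sequence in frames per second.
    Returns an array of shape (T, 3) in position units per second squared.
    """
    positions = np.asarray(positions, dtype=float)
    if positions.shape[0] < 3:
        raise ValueError("need at least 3 frames for a second-order difference")
    dt = 1.0 / fps
    acc = np.zeros_like(positions)
    # Central second-order difference for the interior frames.
    acc[1:-1] = (positions[:-2] - 2.0 * positions[1:-1] + positions[2:]) / dt**2
    # Assumption: copy the nearest interior value to the two boundary frames.
    acc[0] = acc[1]
    acc[-1] = acc[-2]
    return acc
```

For a trajectory with constant acceleration the central difference is exact, which makes it a convenient sanity check before applying the function to real pose data.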

Paper Structure

This paper contains 40 sections, 11 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: Method overview. Given an original pose sequence $\boldsymbol{X}_{ori}=\{\boldsymbol{x}_1,\dots,\boldsymbol{x}_T\}$, we train a VAE model to generate new poses $\boldsymbol{X}'$ autoregressively, frame by frame. The model captures the motion variance and can generate an unlimited number of poses within this distribution. Motion jitter and artifacts are then reduced by solving a quadratic optimization problem, which is based on a dual position ($\boldsymbol{p}$) and rotation ($\boldsymbol{\theta}$) PD controller and on physical constraints on the reaction forces $\boldsymbol{\lambda}$ and torques $\boldsymbol{\tau}$. The final augmented poses $\boldsymbol{X}_{aug}$ can be used to augment the dataset by synthesizing IMU data.
  • Figure 2: Visualization of a handball-throwing motion. Ten motion sequences are generated by MotionAug, ACTOR, MDM-M2M, MDM-T2M, and our method.
  • Figure 3: Reaction force estimation for climbing stairs. Blue arrows show the force vectors on the two feet, alongside the time series of the vertical reaction forces. PIP fails when the subject is off the ground, while our method does not have this limitation.
  • Figure 4: Details of the VAE model structure. Two adjacent frames are first fed into the encoder through two separate residual blocks. After reparameterization, the decoder, which uses a mixture-of-experts (MoE) architecture, reconstructs the prediction of the current frame $\boldsymbol{x}_{t}'$.
  • Figure 5: More poses generated by PoseAugment, including various motion types. In each subfigure, one ground truth pose (green) and 9 augmented poses (red) are visualized.
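
The Figure 1 caption mentions a dual PD controller on positions $\boldsymbol{p}$ and rotations $\boldsymbol{\theta}$. As a rough illustration of what a PD (proportional-derivative) controller computes, the sketch below returns a target acceleration from a reference value, the current value, and the current velocity. The function name and the gains `kp` and `kd` are hypothetical, and rotations are treated as plain angle vectors that can be subtracted, which is a simplification. The quadratic optimization and the constraints on $\boldsymbol{\lambda}$ and $\boldsymbol{\tau}$ are not covered here.

```python
import numpy as np

def pd_target_acceleration(ref, cur, vel, kp, kd):
    """Textbook PD law: a = kp * (ref - cur) - kd * vel.

    ref, cur: reference and current values (positions or joint angles).
    vel: current velocity of the same quantity.
    kp, kd: proportional and derivative gains (hypothetical values, not from the paper).
    """
    ref, cur, vel = (np.asarray(a, dtype=float) for a in (ref, cur, vel))
    return kp * (ref - cur) - kd * vel

# "Dual" use: the same law applied once to positions and once to rotations.
acc_p = pd_target_acceleration(ref=[1.0, 0.0, 0.0], cur=[0.0, 0.0, 0.0],
                               vel=[0.5, 0.0, 0.0], kp=100.0, kd=10.0)
acc_theta = pd_target_acceleration(ref=[0.2, 0.0, 0.0], cur=[0.1, 0.0, 0.0],
                                   vel=[0.0, 0.0, 0.0], kp=50.0, kd=5.0)
```

In a physics-based optimizer, target accelerations like these would typically enter the objective as quantities to track, while the physical constraints limit which accelerations are feasible.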