FIP: Endowing Robust Motion Capture on Daily Garment by Fusing Flex and Inertial Sensors
Jiawei Fang, Ruonan Zheng, Yuanyao, Xiaoxia Gao, Chengxu Zuo, Shihui Guo, Yiyue Luo
TL;DR
FIP advances clothing-based motion capture (MoCap) by fusing flex sensors with IMUs in everyday garments and compensating for the sensor displacements that loose clothing introduces. It combines three components: a Displacement Latent Diffusion Model (DLDM) that synthesizes inertial disturbances, a Physics-informed Calibrator (PIC) that corrects the primary displacement of the flex sensors, and a Pose Fusion Predictor (PFP) that fuses the multimodal readings. The approach is trained on simulated displacement data augmented by diffusion-based sampling, then validated on real-device data, where it improves on state-of-the-art IMU methods by 19.5% in angular error, 26.4% in elbow angular error, and 30.1% in positional error. The system runs at 60 Hz and supports applications in VR/AR, rehabilitation, and fitness analysis, enabling robust, comfortable motion capture without camera-based infrastructure. By integrating sensor fusion, generative data synthesis, and physics-informed calibration, this work lays the groundwork for scalable, ubiquitous MoCap from loose clothing.
Abstract
What if our clothes could capture our body motion accurately? This paper introduces Flexible Inertial Poser (FIP), a novel motion-capture system that uses daily garments fitted with two elbow-attached flex sensors and four Inertial Measurement Units (IMUs). Sensor displacements are inevitable in loose wearables and significantly degrade joint tracking accuracy. To address them, we identify the distinct characteristics of flex and inertial sensor displacements and, based on these observations, develop a Displacement Latent Diffusion Model and a Physics-informed Calibrator to compensate for the displacements, resulting in a substantial improvement in motion capture accuracy. We also introduce a Pose Fusion Predictor to enhance multimodal sensor fusion. Extensive experiments demonstrate that our method achieves robust performance across varying body shapes and motions, significantly outperforming state-of-the-art (SOTA) IMU approaches with a 19.5% improvement in angular error, a 26.4% improvement in elbow angular error, and a 30.1% improvement in positional error. FIP opens up opportunities for ubiquitous human-computer interaction and diverse interactive applications such as the Metaverse, rehabilitation, and fitness analysis.
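The abstract reports improvements as percentages but does not define the underlying metrics. The sketch below shows one common convention, assumed here rather than taken from the paper: angular error as the geodesic angle between a predicted and a reference joint rotation, and improvement as the relative reduction in error against a baseline. The function names and the sample numbers are hypothetical and purely illustrative; they are not results from FIP.

```python
import math


def rotation_angle_deg(r_pred, r_true):
    """Geodesic angle in degrees between two 3x3 rotation matrices.

    Assumed metric definition (not confirmed by the paper):
    theta = arccos((trace(r_pred^T r_true) - 1) / 2).
    """
    # trace(A^T B) equals the sum of element-wise products of A and B.
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def relative_improvement(baseline_error, new_error):
    """Percentage reduction in error relative to a baseline method."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Illustrative inputs only: identity versus a 90-degree rotation about z.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ROT_Z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    print(rotation_angle_deg(IDENTITY, ROT_Z_90))
    # Hypothetical errors: a baseline of 20.0 reduced to 15.0.
    print(relative_improvement(20.0, 15.0))
```

Under this convention, a reported 19.5% improvement in angular error would mean that the mean angular error is 19.5% lower than the baseline's, not that it dropped by 19.5 degrees.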
