PressTrack-HMR: Pressure-Based Top-Down Multi-Person Global Human Mesh Recovery

Jiayue Yuan, Fangting Xie, Guangwen Ouyang, Changhai Ma, Ziyu Wu, Heyu Ding, Quan Wan, Yi Ke, Yuchen Wu, Xiaohui Cai

TL;DR

PressTrack-HMR addresses privacy-preserving multi-person global human mesh recovery using tactile pressure mats. It introduces a two-stage approach: PressTrack, which tracks each person's pressure footprint through detection followed by UoE-based inter-frame association, and a Transformer-based HMR module that regresses SMPL parameters from temporal single-person pressure maps. The work also provides the MIP dataset to enable pressure-based multi-person motion analysis. End-to-end evaluation reports multi-person mesh recovery errors of $\text{MPJPE}=89.2\ \mathrm{mm}$ and $\text{WA-MPJPE}_{100}=112.6\ \mathrm{mm}$, and footprint tracking scores of $\text{MOTA}=93.6\%$ and $\text{MOTP}=94.8\%$, illustrating the potential of tactile mats for privacy-preserving crowd analysis and motion capture.
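The inter-frame association step can be illustrated with a small sketch. The UoE criterion is not defined in this summary, so the code below substitutes plain intersection-over-union (IoU) with greedy matching as a stand-in; it is not the authors' method. The function names `iou` and `associate`, the box format `(x1, y1, x2, y2)`, and the `min_iou` threshold are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.1):
    """Greedily match current detections to existing tracks.

    tracks:     dict mapping integer track id -> last known box
    detections: list of boxes detected in the current frame
    Returns a dict mapping detection index -> track id. Detections that
    match no track receive a fresh id.
    NOTE: IoU is a stand-in for the paper's UoE criterion (assumption).
    """
    # Score every (track, detection) pair, best overlap first.
    pairs = sorted(
        ((iou(tbox, dbox), tid, di)
         for tid, tbox in tracks.items()
         for di, dbox in enumerate(detections)),
        reverse=True,
    )
    assigned = {}
    used_tracks = set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same person
        if tid in used_tracks or di in assigned:
            continue
        assigned[di] = tid
        used_tracks.add(tid)
    # Start new tracks for unmatched detections.
    next_id = max(tracks, default=-1) + 1
    for di in range(len(detections)):
        if di not in assigned:
            assigned[di] = next_id
            next_id += 1
    return assigned
```

Greedy matching keeps the sketch short; an optimal assignment (for example, the Hungarian algorithm) would be a reasonable alternative when footprints of several people overlap.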

Abstract

Multi-person global human mesh recovery (HMR) is crucial for understanding crowd dynamics and interactions. Traditional vision-based HMR methods sometimes face limitations in real-world scenarios due to mutual occlusions, insufficient lighting, and privacy concerns. Human-floor tactile interactions offer an occlusion-free and privacy-friendly alternative for capturing human motion. Existing research indicates that pressure signals acquired from tactile mats can effectively estimate human pose in single-person scenarios. However, when multiple individuals walk randomly on the mat simultaneously, distinguishing the intermingled pressure signals generated by different persons, and then acquiring temporal pressure data for each individual, remains an open challenge for extending pressure-based HMR to the multi-person setting. In this paper, we present PressTrack-HMR, a top-down pipeline that recovers multi-person global human meshes solely from pressure signals. This pipeline leverages a tracking-by-detection strategy to first identify and segment each individual's pressure signal from the raw pressure data, and subsequently performs HMR for each extracted individual signal. Furthermore, we build a multi-person interaction pressure dataset, MIP, which facilitates further research into pressure-based human motion analysis in multi-person scenarios. Experimental results demonstrate that our method excels in multi-person HMR using pressure data, with 89.2 mm MPJPE and 112.6 mm $\text{WA-MPJPE}_{100}$. These results showcase the potential of tactile mats for ubiquitous, privacy-preserving multi-person action recognition. Our dataset and code are available at https://github.com/Jiayue-Yuan/PressTrack-HMR.
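For readers unfamiliar with the reported metric, the sketch below shows how MPJPE (mean per-joint position error) is commonly computed: the Euclidean distance between predicted and ground-truth 3D joints, averaged over joints and frames. The function name `mpjpe_mm`, the `(frames, joints, 3)` array layout, and the assumption that inputs are in metres are illustrative choices; no root alignment is applied here, and the exact evaluation protocol of the paper (including the $\text{WA-MPJPE}_{100}$ variant) is not reproduced.

```python
import numpy as np


def mpjpe_mm(pred, gt):
    """Mean per-joint position error in millimetres.

    pred, gt: arrays of shape (frames, joints, 3), assumed to be in metres.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Euclidean distance per joint, then the mean over joints and frames.
    per_joint_error = np.linalg.norm(pred - gt, axis=-1)
    return float(per_joint_error.mean() * 1000.0)


# Usage: every joint is offset by (3 cm, 4 cm, 0), i.e. 5 cm of error.
gt = np.zeros((2, 24, 3))
pred = gt + np.array([0.03, 0.04, 0.0])
error = mpjpe_mm(pred, gt)  # approximately 50.0 mm
```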

Paper Structure

This paper contains 14 sections, 5 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: (a) A sequence of consecutive raw pressure maps, displayed from left to right at 0.27-second intervals. (b) The same pressure maps with different colored detection boxes representing the footprints of different individuals.
  • Figure 2: (a) Two separate footprint regions caused by the shoe sole. (b) Dynamics of pressure maps during human locomotion. Dashed large boxes indicate two-footed contact; solid small boxes indicate single-footed contact.
  • Figure 3: The framework of our proposed pipeline PressTrack-HMR.
  • Figure 4: Object detection label generation process. (a) Initial discrete pressure regions; (b) Project 2D toe-base and ankle joints for different individuals; (c) Assign discrete regions to individuals based on geometric proximity; (d) Merge regions assigned to the same individual into one box.
  • Figure 5: Architecture of our human pose estimation model.
  • ...and 2 more figures
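
The label generation steps described in the Figure 4 caption can be sketched roughly as follows, assuming that each discrete pressure region is given as a box and that "geometric proximity" means the distance from the region centre to the nearest projected foot joint. Both assumptions, along with the function name `generate_labels` and the data layout, are illustrative and are not taken from the paper.

```python
import math


def generate_labels(regions, joints_by_person):
    """Assign pressure regions to people and merge them into one box each.

    regions:          list of boxes (x1, y1, x2, y2), one per discrete region
    joints_by_person: dict mapping person id -> list of projected 2D
                      toe-base and ankle joints as (x, y)
    Returns a dict mapping person id -> merged box. People with no
    assigned region are omitted.
    """
    merged = {}
    for x1, y1, x2, y2 in regions:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        # Step (c): the owner is the person whose nearest foot joint is
        # closest to the region centre (assumed proximity rule).
        owner = min(
            joints_by_person,
            key=lambda pid: min(
                math.hypot(cx - jx, cy - jy)
                for jx, jy in joints_by_person[pid]
            ),
        )
        # Step (d): grow the owner's box to cover this region as well.
        if owner in merged:
            mx1, my1, mx2, my2 = merged[owner]
            merged[owner] = (min(mx1, x1), min(my1, y1),
                             max(mx2, x2), max(my2, y2))
        else:
            merged[owner] = (x1, y1, x2, y2)
    return merged


# Usage: person "A" has both feet down, person "B" has one.
regions = [(0, 0, 2, 4), (3, 0, 5, 4), (20, 20, 22, 24)]
joints = {"A": [(1, 1), (1, 3), (4, 1), (4, 3)], "B": [(21, 21), (21, 23)]}
labels = generate_labels(regions, joints)
```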