Why Learn What Physics Already Knows? Realizing Agile mmWave-based Human Pose Estimation via Physics-Guided Preprocessing

Shuntian Zheng, Jiaqi Li, Minzhe Ni, Xiaoman Lu, Yu Guan

TL;DR

This work revisits millimeter-wave (mmWave) human pose estimation (HPE) from a signal preprocessing perspective and introduces processing modules that explicitly model mmWave's inter-dimensional correlations and human kinematics.

Abstract

We revisit millimeter-wave (mmWave) human pose estimation (HPE) from a signal preprocessing perspective. A single mmWave frame provides structured dimensions that map directly to human geometry and motion: range, angle, and Doppler, offering pose-aligned cues that are not explicitly present in RGB images. However, recent mmWave-based HPE systems require more parameters and compute resources yet yield lower estimation accuracy than vision baselines. We attribute this to preprocessing modules: most systems rely on data-driven modules to estimate phenomena that are already well-defined by mmWave sensing physics, whereas human pose could be captured more efficiently with explicit physical priors. To this end, we introduce processing modules that explicitly model mmWave's inter-dimensional correlations and human kinematics. Our design (1) couples range and angle to preserve spatial human structure, (2) leverages Doppler to retain human motion continuity, and (3) applies multi-scale fusion aligned with the human body. A lightweight MLP serves as the regressor. In experiments, this framework reduces the number of parameters by 55.7% to 88.9% on the HPE task relative to existing mmWave baselines while maintaining competitive accuracy. Its lightweight nature also enables real-time deployment on a Raspberry Pi. Code and deployment artifacts will be released upon acceptance.
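
To make the claim that range, angle, and Doppler "map directly to human geometry and motion" concrete, the following is a minimal sketch of the textbook FMCW radar relations that convert FFT bin indices into physical quantities. It is not the paper's preprocessing pipeline. The radar parameters (77 GHz carrier, 4 GHz bandwidth, 128 chirps per frame, 50 µs chirp period, half-wavelength antenna spacing) and all function names are hypothetical and chosen only for illustration; the sketch also assumes no zero-padding in the FFTs and a Doppler axis centered on zero velocity.

```python
import math

C = 299_792_458.0  # speed of light in m/s

# Hypothetical radar configuration (illustrative values, not from the paper).
CARRIER_HZ = 77e9
BANDWIDTH_HZ = 4e9
NUM_CHIRPS = 128
CHIRP_PERIOD_S = 50e-6
SPACING_M = (C / CARRIER_HZ) / 2.0  # half-wavelength RX antenna spacing


def range_resolution(bandwidth_hz):
    """Range covered by one range-FFT bin: c / (2B)."""
    return C / (2.0 * bandwidth_hz)


def velocity_resolution(carrier_hz, num_chirps, chirp_period_s):
    """Radial velocity covered by one Doppler-FFT bin: lambda / (2 * N * Tc)."""
    wavelength = C / carrier_hz
    return wavelength / (2.0 * num_chirps * chirp_period_s)


def angle_from_phase(phase_diff_rad, carrier_hz, spacing_m):
    """Angle of arrival from the phase difference between adjacent RX antennas."""
    wavelength = C / carrier_hz
    s = wavelength * phase_diff_rad / (2.0 * math.pi * spacing_m)
    s = max(-1.0, min(1.0, s))  # guard against rounding outside asin's domain
    return math.asin(s)


def bin_to_physical(range_bin, doppler_bin, phase_diff_rad):
    """Map (range bin, Doppler bin, inter-antenna phase) to (m, m/s, rad)."""
    r = range_bin * range_resolution(BANDWIDTH_HZ)
    # Assumes a shifted Doppler axis: bin NUM_CHIRPS // 2 is zero velocity.
    v = (doppler_bin - NUM_CHIRPS // 2) * velocity_resolution(
        CARRIER_HZ, NUM_CHIRPS, CHIRP_PERIOD_S
    )
    theta = angle_from_phase(phase_diff_rad, CARRIER_HZ, SPACING_M)
    return r, v, theta


if __name__ == "__main__":
    r, v, theta = bin_to_physical(40, 70, math.pi / 2)
    print(f"range={r:.3f} m, velocity={v:.3f} m/s, angle={math.degrees(theta):.1f} deg")
```

Because these mappings are closed-form, a reflection's position and radial velocity follow from its bin indices without any learned parameters, which is the kind of physical prior the abstract argues should not be re-estimated by data-driven modules.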


Paper Structure

This paper contains 52 sections, 15 equations, 7 figures, and 11 tables.

Figures (7)

  • Figure 1: Existing mmWave-based HPE process (top) and our solution (bottom).
  • Figure 2: The physical characteristics of mmWave’s three dimensions.
  • Figure 3: Framework processing flow, which includes: (a) Spatial Structure Preservation based on range-angle dimensions; (b) Motion Continuity Preservation based on Doppler dimension; (c) Hierarchical Multi-Scale Fusion for human body, and (d) Pose Regression Network.
  • Figure 4: CPU usage distribution of all methods.
  • Figure 5: Throughput-error trade-off on Raspberry Pi across 5 configurations.
  • ...and 2 more figures