
SUPER: Seated Upper Body Pose Estimation using mmWave Radars

Bo Zhang, Zimeng Zhou, Boyu Jiang, Rong Zheng

TL;DR

This work addresses seated upper-body human pose estimation (SUB-HPE) using mmWave radar by introducing SUPER, a dual-radar framework that fuses intensity and Doppler information from two closely spaced sensors oriented perpendicularly. The system builds fine-grained intensity point cloud (IPC) and Doppler point cloud (DPC) representations, processes them with a lightweight PointNet/PointNet++-based backbone augmented with LSTMs, and outputs SMPL-based upper-body poses while freezing lower-body joints. In leave-one-subject-out evaluations on 10 participants, SUPER significantly outperforms a state-of-the-art mmMesh baseline in MPJPE, PA-MPJPE, and PCK@15mm, supporting the benefit of dual-radar fusion and multi-modal point clouds for SUB-HPE. A demonstration on hand-object interaction further indicates practical utility for downstream tasks, such as driving and interface control, under privacy-preserving mmWave sensing. The dataset and implementation details provide a foundation for further exploration of SUB-HPE with mmWave radars in constrained environments.
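The metrics named above are not defined in this excerpt. As a minimal sketch, assuming their standard definitions from the pose-estimation literature, they can be computed as follows. The function names, the `(J, 3)` joint-array layout, and the threshold argument are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def pck(pred, gt, threshold):
    """Fraction of joints whose error is at most `threshold` (same units as inputs)."""
    return float((np.linalg.norm(pred - gt, axis=-1) <= threshold).mean())


def procrustes_align(pred, gt):
    """Align pred (J, 3) to gt (J, 3) with a rotation, uniform scale, and translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # Orthogonal Procrustes: SVD of the 3x3 cross-covariance matrix.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    flip = np.array([1.0, 1.0, d])
    rot = u @ np.diag(flip) @ vt
    scale = (s * flip).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment, which removes global pose and scale error."""
    return mpjpe(procrustes_align(pred, gt), gt)
```

PA-MPJPE discounts errors in global rotation, translation, and scale, so it isolates how well the articulated pose itself is recovered, while MPJPE and PCK also penalize global misplacement.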

Abstract

In industrialized countries, adults spend a considerable amount of time sedentary each day at work, while driving, and during activities of daily living. Characterizing seated upper-body human poses using mmWave radars is an important yet under-studied topic with many applications in human-machine interaction, transportation, and road safety. In this work, we devise SUPER, a framework for seated upper-body human pose estimation that uses two mmWave radars in close proximity. A novel masking algorithm is proposed to coherently fuse data from the radars and generate intensity and Doppler point clouds with complementary information for high-motion but small radar cross section (RCS) areas (e.g., upper extremities) and low-motion but large RCS areas (e.g., torso). A lightweight neural network extracts both global and local features of the upper body and outputs pose parameters for the Skinned Multi-Person Linear (SMPL) model. Extensive leave-one-subject-out experiments on various motion sequences from multiple subjects show that SUPER outperforms a state-of-the-art baseline method by 30% to 184%. We also demonstrate its utility in a simple downstream task for hand-object interaction.

Paper Structure

This paper contains 20 sections, 7 equations, 11 figures, and 6 tables.

Figures (11)

  • Figure 1: The skeleton model estimated by SUPER vs. ground truth when a seated subject raises a hand. The blue circle markers show the estimated skeleton model, and the red plus markers show the corresponding ground truth.
  • Figure 2: A two-dimensional MIMO antenna array for the IWR6843ISK radar. The separation $d$ equals half a wavelength.
  • Figure 3: The system diagram of SUPER. New processing blocks introduced in this paper are highlighted in orange, and intermediate data flows are highlighted in blue.
  • Figure 4: Generation of dense point clouds from raw radar data. One intensity point cloud and one Doppler point cloud are produced for each radar separately.
  • Figure 5: Generation of fine-grained point clouds by fusing and sampling dense point clouds from the two radars.
  • ...and 6 more figures