Expanding mmWave Datasets for Human Pose Estimation with Unlabeled Data and LiDAR Datasets

Zhuoxuan Peng; Boan Zhu; Xingjian Zhang; Wenying Li; S. -H. Gary Chan

Abstract

Current mmWave datasets for human pose estimation (HPE) are scarce and lack diversity in both point cloud (PC) attributes and human poses, which severely hampers the generalization ability of models trained on them. In contrast, unlabeled mmWave HPE data and diverse LiDAR HPE datasets are readily available. We propose EMDUL, a novel approach that expands the volume and diversity of an existing mmWave dataset using unlabeled mmWave data and a LiDAR dataset. EMDUL trains a pseudo-label estimator to annotate the unlabeled mmWave data, and converts (translates) a given annotated LiDAR PC into its mmWave counterpart. Expanded with both LiDAR-converted and pseudo-labeled mmWave PCs, our mmWave dataset significantly boosts the performance and generalization ability of all our HPE models, reducing error by 15.1% in in-domain settings and by 18.9% in out-of-domain settings.

Paper Structure (33 sections, 7 equations, 8 figures, 11 tables)

Introduction
Related Works
mmWave-based Human Pose Estimation
Data Expansion or Augmentation for mmWave Datasets
Semi-Supervised Learning Approaches
Problem Formulation and EMDUL Overview
Problem Formulation
EMDUL Overview
Training on Expanded Dataset
Pseudo-labeling of Unlabeled mmWave Data
Unsupervised Temporal Consistency Loss (UTCL)
Training of Pseudo-label Estimator
Converting LiDAR Datasets to mmWave Point Clouds
PC Conversion Pipeline
Flow-based Point Filtering (FPF)
...and 18 more sections

Figures (8)

Figure 1: Examples illustrating the effect of dataset expansion. (a) Samples from an mmWave HPE training dataset. (b) Samples from a LiDAR dataset with richer pose diversity, used for dataset expansion. (c) An mmWave PC from an unseen scenario. (d) The ground-truth skeleton. (e) The skeleton predicted by the state-of-the-art P4T (Fan et al., 2021) without expansion. (f) The skeleton predicted by P4T trained on the EMDUL-expanded dataset. Joints are colored red for errors $>10\,\text{cm}$ and green otherwise. EMDUL achieves stronger generalization ability than the baseline P4T.
Figure 2: The overview of EMDUL integrating both PC conversion and pseudo-labeling modules.
Figure 3: Illustration of the motion-detection mechanism in mmWave radar using an MM-Fi sample. Joints with high flow (yellow) lie close to detected points, while low-flow joints (dark blue) have no nearby points.
Figure 4: Step-by-step visualization of the point-cloud (PC) conversion pipeline. Blue joints have lower flow magnitudes and yellow joints have higher ones.
Figure 5: Sample point clouds from different mmWave and LiDAR HPE datasets and the standardized 15-keypoint skeleton structure used in this paper.
...and 3 more figures
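
The joint-coloring rule in the Figure 1 caption (red for errors above 10 cm, green otherwise) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the error is assumed to be the per-joint Euclidean distance between prediction and ground truth, and coordinates are assumed to be in metres.

```python
import math

# 10 cm threshold from the Figure 1 caption, expressed in metres
# (assumption: joint coordinates are given in metres).
THRESHOLD_M = 0.10


def joint_errors(pred, gt):
    """Per-joint Euclidean distance between predicted and ground-truth joints.

    Hypothetical helper: `pred` and `gt` are equal-length sequences of
    (x, y, z) tuples, one per skeleton joint.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of joints")
    return [math.dist(p, g) for p, g in zip(pred, gt)]


def joint_colors(pred, gt, threshold=THRESHOLD_M):
    """Label each joint 'red' if its error exceeds the threshold, else 'green'."""
    return ["red" if err > threshold else "green"
            for err in joint_errors(pred, gt)]


# Toy three-joint example (made-up coordinates, not data from the paper).
example_gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.5, 0.0, 1.0)]
example_pred = [(0.0, 0.0, 0.05), (0.2, 0.0, 1.0), (0.5, 0.0, 1.0)]
example_colors = joint_colors(example_pred, example_gt)
```

In the toy example, only the second joint is off by more than 10 cm (0.2 m), so it is the only one labeled red. For the 15-keypoint skeleton of Figure 5, the same functions would simply be called with 15 joints per skeleton.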
