milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing

Fangqiang Ding, Zhen Luo, Peijun Zhao, Chris Xiaoxuan Lu

TL;DR

milliFlow addresses non-rigid human motion sensing from sparse mmWave radar data by estimating per-point scene flow between consecutive radar frames. The method combines multi-scale local features, global attention, and temporal information via a GRU, followed by a constrained regression that yields plausible per-point displacements. It is trained with pseudo labels generated automatically from co-located RGB-D data, which reduces the labeling burden while maintaining supervision quality. Results show centimeter-level scene flow accuracy, real-time performance, and measurable improvements in human activity recognition (HAR) and human parsing (HP), along with support for human body part tracking (HBPT), demonstrating the practical value of scene flow as a low-level motion cue for privacy-preserving radar sensing.
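To make "per-point scene flow" and "centimeter-level accuracy" concrete, here is a minimal sketch, assuming flow is stored as one 3D displacement vector per radar point and that accuracy is measured with the mean end-point error, a common scene flow metric. The function name `end_point_error` and the toy numbers are illustrative assumptions, not taken from the milliFlow code.

```python
import numpy as np

def end_point_error(pred_flow, ref_flow):
    """Mean Euclidean distance (in metres) between predicted and reference flow vectors."""
    return float(np.linalg.norm(pred_flow - ref_flow, axis=1).mean())

# Two radar points in frame t (x, y, z in metres); values are made up.
points = np.array([[0.0, 1.0, 2.0],
                   [0.5, 1.0, 2.0]])

# Reference flow: the first point moves 5 cm, the second stays still.
ref_flow = np.array([[0.03, 0.0, 0.04],
                     [0.0, 0.0, 0.0]])

# Warping frame t by its flow approximates where each point is in frame t+1.
warped = points + ref_flow

# A trivial "predict no motion" baseline, used only to exercise the metric.
pred_flow = np.zeros_like(ref_flow)
epe = end_point_error(pred_flow, ref_flow)
print(f"EPE: {epe * 100:.1f} cm")
```

Under this metric, a centimeter-level result means the predicted displacement vectors deviate from the reference vectors by a few centimeters on average.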

Abstract

Human motion sensing plays a crucial role in smart systems for decision-making, user interaction, and personalized services. Existing research is predominantly based on cameras, whose intrusive nature limits their use in smart home applications. To address this, mmWave radars have gained popularity due to their privacy-friendly features. In this work, we propose milliFlow, a novel deep learning approach to estimate scene flow as complementary motion information for mmWave point clouds, serving as an intermediate level of features and directly benefiting downstream human motion sensing tasks. Experimental results demonstrate the superior performance of our method compared with competing approaches. Furthermore, by incorporating scene flow information, we achieve remarkable improvements in human activity recognition and human parsing, and support human body part tracking. Code and dataset are available at https://github.com/Toytiny/milliFlow.

Paper Structure (33 sections, 3 equations, 8 figures, 8 tables)

Figures (8)

  • Figure 1: We propose milliFlow, a scene flow estimation module to provide an additional layer of point-wise motion information on top of the original mmWave radar point cloud in the conventional mmWave-based human motion sensing pipeline.
  • Figure 2: mmWave-based scene flow network architecture. The network takes consecutive radar point clouds as the input and outputs the scene flow in between.
  • Figure 3: Automatic scene flow labelling pipeline. With the help of the co-located RGB-D camera, we first label 3D human skeletons and then generate noisy pseudo scene flow labels with respect to the skeleton-based rigid-motion assumption.
  • Figure 4: Collection setup, test environment, subject activities and pseudo pose labels.
  • Figure 5: Qualitative scene flow results. Radar points and scene flow vectors are projected onto the image; red denotes the ground truth and blue denotes ours.
  • ...and 3 more figures
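
The labelling idea in the Figure 3 caption (skeleton first, then pseudo flow under a skeleton-based rigid-motion assumption) can be sketched roughly as follows. This is a simplified illustration, not the authors' pipeline: it assumes each radar point is assigned to its nearest skeleton joint and inherits that joint's translation between frames, and it ignores rotation. The function `pseudo_flow_labels` and all values are hypothetical.

```python
import numpy as np

def pseudo_flow_labels(points, joints_t, joints_t1):
    """Assign each radar point the frame-to-frame translation of its nearest joint.

    points:    (N, 3) radar points at frame t
    joints_t:  (J, 3) skeleton joints at frame t
    joints_t1: (J, 3) the same joints at frame t+1
    Returns (N, 3) pseudo flow labels and (N,) nearest-joint indices.
    """
    # Pairwise point-to-joint distances, shape (N, J).
    dists = np.linalg.norm(points[:, None, :] - joints_t[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    # Simplifying assumption: a point moves rigidly with its joint (translation only).
    joint_motion = joints_t1 - joints_t
    return joint_motion[nearest], nearest

# Toy skeleton with two joints, and two radar points near them (metres).
joints_t = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
joints_t1 = np.array([[0.0, 0.0, 0.1], [1.0, 0.2, 0.0]])
points = np.array([[0.1, 0.0, 0.0], [0.9, 0.1, 0.0]])

flow, idx = pseudo_flow_labels(points, joints_t, joints_t1)
print(idx)   # index of the nearest joint for each point
print(flow)  # pseudo flow label for each point
```

Labels produced this way are noisy, as the caption notes, because radar points rarely sit exactly on a joint and real limbs also rotate.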