OPPH: A Vision-Based Operator for Measuring Body Movements for Personal Healthcare

Chen Long-fei, Subramanian Ramamoorthy, Robert B Fisher

TL;DR

OPPH addresses the need for reliable vision-based body motion estimation in healthcare by introducing a multi-stage operator that gates frame-level movement estimates with a binary motion-detection state derived from frame differences and a body mask. It integrates with both pose-based and optical-flow methods, notably RAFT, to suppress real-world noise and preserve long-term movement trends, achieving significant RMSE reductions on real-world motionless data (HuMoLs) and competitive gains on real-world and synthetic datasets (JHMDB, Surreal). The approach maintains high correlation with ground-truth motion over time ($r \approx 0.92$--$0.99$ in key tests) and runs in real time (about 13.33 fps) on standard hardware, making it suitable for crisis detection and chronic-condition monitoring in healthcare settings. Overall, OPPH offers a practical denoising and gating mechanism that enhances healthcare-oriented motion analysis without sacrificing active-motion accuracy, enabling robust long-term monitoring and timely intervention.

Abstract

Vision-based motion estimation methods show promise in accurately and unobtrusively estimating human body motion for healthcare purposes. However, these methods are not specifically designed for healthcare and face challenges in real-world applications. Human pose estimation methods often lack the accuracy needed to detect fine-grained, subtle body movements, while optical flow-based methods struggle with poor lighting conditions and unseen real-world data. These issues cause errors in human body motion estimation, particularly in critical medical situations where the body is motionless, such as unconsciousness. To address these challenges and improve the accuracy of human body motion estimation for healthcare purposes, we propose the OPPH operator, which is designed to enhance current vision-based motion estimation methods. This operator, which takes the properties of human body movement and noise into account, functions as a multi-stage filter. Results on two real-world datasets and one synthetic human motion dataset demonstrate that the operator effectively removes real-world noise, significantly improves the detection of motionless states, maintains the accuracy of active body movement estimates, and preserves long-term body movement trends. This method could be beneficial for analyzing both critical medical events and chronic medical conditions.

Paper Structure (8 sections, 3 equations, 4 figures, 2 tables)

This paper contains 8 sections, 3 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The estimation error of 2D body motion speed was evaluated for two real-world scenario datasets: one with only 'active' body movement (the JHMDB dataset, containing 21 action classes), and another with 'no' body movement (the Human MotionLess dataset, HuMoLs, consisting of 67 videos from 18 subjects). The Root Mean Square Error (RMSE) is stacked across all classes (or videos). Four motion estimation methods (the pose-based VitPose and HRNet, and the optical flow-based RAFT and Farneback) were compared to the ground truth. Pose-based methods showed larger errors than optical flow-based methods in both motion and motionless scenarios.
  • Figure 2: The proposed OPPH operator for enhancing human body motion estimation consists of thresholding $\theta$, an $n \times n$ spatial filter, and a $1 \times m$ temporal filter. First, three-color-channel thresholding is applied to the difference image derived from two consecutive RGB images to identify large-change pixels. The large-change image is then masked by a pre-obtained human body mask and processed through a spatial filter to remove isolated noise. Next, the prominent motion image is compressed into a single binary variable $S_t$, which is temporally filtered to produce $S_t'$, indicating whether frame $t$ contains true body movement. Finally, the filtered signal $S_t'$ is used to gate the original body motion speed estimation (derived from optical flow magnitudes or pose displacements and the body mask) to obtain the final body motion speed estimation result for each frame.
  • Figure 3: Accuracy of 2D human body motion speed estimation. (a) Motion speed estimation error (mean RMSE across classes/videos) on all datasets for the four original motion estimation methods (Ori) and for the same methods with OPPH (+OP). (b) The median RMSE (outliers excluded) for the same comparison as in (a). (c) Comparison of speed distributions between the ground truth motion speed and the speed estimated by the best method, RAFT, and by RAFT with OPPH (+OP). (d) Accuracy (mean RMSE) of OPPH compared with other types of filters on RAFT. OPPH (+OP) was compared with the Median filter (+med), Bilateral filter (+bil), Total Variation (+TV), and Kalman Filter (+kal).
  • Figure 4: Long-term correlation between the estimated 2D body motion and the ground truth motion. The videos in the two motion datasets are stacked into longer time windows, and the mean motion speed within each window is calculated (pixels per time window). Each plot averages the data over a different period, as reported under the plot. The horizontal axis is the index of the period. The results show that OPPH (+OP) combined with the optical flow-based method, RAFT, can still effectively capture the trend of body movement changes. The distributions in Figure 3 show that RAFT underestimates fast movement in JHMDB, and that the OPPH operator further suppresses the RAFT estimates in low-speed ranges. This explains the offset between the curves for JHMDB; note, however, that the correlation remains high. For Surreal, RAFT underestimated the middle-range speed compared to the ground truth but overestimated the low-range speed; OPPH does not have any substantial effect on the estimated value.
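
The pipeline described in the Figure 2 caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the default values of `theta`, `n`, `m`, and `min_count`, the rule that all three channels must exceed the threshold, the neighbour-count form of the spatial filter, and the majority-vote form of the temporal filter are all hypothetical choices made here for illustration. The caption only states that a threshold $\theta$, an $n \times n$ spatial filter, and a $1 \times m$ temporal filter are applied.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def frame_motion_state(prev_rgb, curr_rgb, body_mask, theta=20, n=3, min_count=3):
    """Return the binary state S_t for one frame pair (1 = prominent body motion).

    n is assumed odd so that the filtered image keeps the input shape.
    """
    # Signed arithmetic avoids uint8 wrap-around in the difference image.
    diff = np.abs(curr_rgb.astype(np.int16) - prev_rgb.astype(np.int16))
    # Assumption: a pixel is "large-change" only if all three channels exceed theta.
    large_change = np.all(diff > theta, axis=-1)
    # Keep only the changes that fall inside the pre-obtained body mask.
    masked = large_change & body_mask.astype(bool)
    # Assumed n x n spatial filter: count changed pixels in each neighbourhood
    # (centre included) and drop pixels with fewer than min_count of them.
    pad = n // 2
    padded = np.pad(masked.astype(np.uint8), pad)
    counts = sliding_window_view(padded, (n, n)).sum(axis=(-1, -2))
    prominent = masked & (counts >= min_count)
    # Compress the prominent-motion image into a single binary variable.
    return int(prominent.any())


def temporal_filter(states, m=5):
    """Assumed 1 x m temporal filter: centred majority vote over m frames (m odd)."""
    states = np.asarray(states, dtype=np.uint8)
    padded = np.pad(states, m // 2, mode="edge")
    votes = sliding_window_view(padded, m).sum(axis=-1)
    return (votes * 2 > m).astype(np.uint8)


def gate_speed(speeds, filtered_states):
    """Zero the per-frame speed estimate wherever S_t' reports no body motion."""
    return np.asarray(speeds, dtype=float) * np.asarray(filtered_states)


if __name__ == "__main__":
    raw_states = [0, 0, 1, 0, 0, 1, 1, 1, 1, 0]   # S_t with one isolated false alarm
    raw_speeds = [0.4, 0.3, 0.5, 0.2, 0.3, 2.1, 2.4, 2.2, 1.9, 0.3]
    filtered = temporal_filter(raw_states, m=3)
    print(filtered)
    print(gate_speed(raw_speeds, filtered))
```

In this sketch the per-frame speeds stand in for whatever the underlying estimator produces (for example, mean optical-flow magnitude inside the body mask). Gating by multiplication leaves estimates untouched in frames judged to contain motion and sets them to zero otherwise, which matches the caption's description of $S_t'$ gating the original estimate.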