In-Bed Pose Estimation: A Review

Ziya Ata Yazıcı, Sara Colantonio, Hazım Kemal Ekenel

TL;DR

This paper surveys in-bed pose estimation, addressing the challenge of predicting body joints when a person is covered by a blanket and privacy concerns limit visible RGB data. It inventories public datasets (Pressure-Sensing Mat, Mannequin In-Bed, BlanketSet, SLP, Patient MoCap) and taxonomizes methods into unimodal and multimodal approaches, highlighting privacy-preserving modalities such as LWIR, depth, and pressure maps. Key insights show that multimodal fusion improves robustness to occlusion and lighting variations, while privacy-focused strategies enable deployment in hospital and home environments; however, lack of standardized benchmarks and limited population diversity hinder cross-study comparability. The review identifies opportunities for future work, including incorporating additional modalities, developing lightweight latent representations to preserve privacy, and establishing diverse, representative datasets and common evaluation protocols to accelerate progress in this domain.

Abstract

Human pose estimation, the process of identifying joint positions in a person's body from images or videos, is a widely used technology across diverse fields, including healthcare. One such healthcare application is in-bed pose estimation, where the body pose of an individual lying under a blanket is analyzed. This task can be used, for instance, to monitor a person's sleep behavior and to detect symptoms early for potential disease diagnosis in homes and hospitals. Several studies have used unimodal and multimodal methods to estimate in-bed human poses. The unimodal studies generally employ RGB images, whereas the multimodal studies use modalities including RGB, long-wavelength infrared, pressure maps, and depth maps. Multimodal studies have the advantage of using modalities in addition to RGB that might capture information useful for coping with occlusions. Moreover, some multimodal studies exclude RGB and are therefore better suited to privacy preservation. To expedite advancements in this domain, we conduct a review of existing datasets and approaches. Our objectives are to show the limitations of previous studies, outline current challenges, and provide insights for future work in the in-bed human pose estimation field.

Paper Structure (12 sections, 4 figures, 2 tables)


Figures (4)

  • Figure 1: Sample pressure maps and estimated poses from the Pressure-Sensing Mat Dataset [clever20183d].
  • Figure 2: Sample images from the Mannequin In-Bed Dataset [liu2019bed] in two modalities, LWIR and RGB, from left to right.
  • Figure 3: Sample images from the SLP dataset [liu2022simultaneously] in four modalities, from left to right: RGB, depth map, LWIR, and pressure map.
  • Figure 4: Non-occluded and occluded sample depth maps from the Patient MoCap Dataset [achilles2016patient].