Hybrid Deep Learning-Based for Enhanced Occlusion Segmentation in PICU Patient Monitoring

Mario Francisco Munoz, Hoang Vu Huy, Thanh-Dung Le

TL;DR

This work tackles occlusion segmentation for remote patient monitoring (RPM) in Pediatric Intensive Care Units (PICUs) under limited training data. It introduces a hybrid pipeline, SOSS, that fuses CNN-based semantic segmentation (DeepLabV3+) with the Segment Anything Model (SAM) through a confidence-based soft-voting mechanism to produce accurate occlusion masks. Evaluated on a real-world PICU dataset collected with a Kinect V2, the approach achieves an IoU of about 85%, with accuracy of about 92.5% and F1 of about 92.0%, an average improvement of 2.75 percentage points across metrics over a CNN baseline. The method improves the reliability of RPM in clinical settings, contributing to safer and more accurate remote monitoring of pediatric patients. It also exposes practical limitations, such as SAM's inference speed and its sensitivity to prompting, that guide future refinements.

Abstract

Remote patient monitoring has emerged as a prominent non-invasive method, using digital technologies and computer vision (CV) to replace traditional invasive monitoring. While neonatal and pediatric departments embrace this approach, Pediatric Intensive Care Units (PICUs) face the challenge of occlusions hindering accurate image analysis and interpretation. Objective: In this study, we propose a hybrid approach to effectively segment common occlusions encountered in remote monitoring applications within PICUs. Our approach centers on creating a deep-learning pipeline for limited training data scenarios. Methods: First, a combination of the well-established Google DeepLabV3+ segmentation model with the transformer-based Segment Anything Model (SAM) is devised for occlusion segmentation mask proposal and refinement. We then train and validate this pipeline using a small dataset acquired from real-world PICU settings with a Microsoft Kinect camera, achieving an Intersection-over-Union (IoU) metric of 85%. Results: Both quantitative and qualitative analyses underscore the effectiveness of our proposed method. The proposed framework yields an overall classification performance with 92.5% accuracy, 93.8% recall, 90.3% precision, and 92.0% F1-score. Consequently, the proposed method consistently improves the predictions across all metrics, with an average of 2.75% gain in performance compared to the baseline CNN-based framework. Conclusions: Our proposed hybrid approach significantly enhances the segmentation of occlusions in remote patient monitoring within PICU settings. This advancement contributes to improving the quality of care for pediatric patients, addressing a critical need in clinical practice by ensuring more accurate and reliable remote monitoring.
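
The metrics quoted in the abstract (IoU, accuracy, recall, precision, F1) can all be derived from the same four confusion counts. The following is a minimal sketch, assuming binary per-pixel labels (1 for occlusion, 0 for everything else); the function names and the toy label lists are hypothetical, and the paper's actual evaluation protocol may differ. It also checks that the reported precision and recall are consistent with the reported F1-score.

```python
def segmentation_metrics(pred, truth):
    """Compute common metrics from two flat sequences of 0/1 pixel labels."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # occlusion found
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false alarm
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)  # occlusion missed
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)  # background kept

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero on degenerate inputs.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "iou": ratio(tp, tp + fp + fn),
    }


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy example with 8 pixels (made-up labels, not data from the paper).
toy_pred = [1, 1, 1, 0, 0, 0, 1, 0]
toy_truth = [1, 1, 0, 0, 0, 0, 1, 1]
toy_metrics = segmentation_metrics(toy_pred, toy_truth)

# Consistency check on the abstract's figures: precision 90.3%, recall 93.8%.
reported_f1 = f1_from(0.903, 0.938)
print(toy_metrics)
print(round(reported_f1, 3))
```

With the abstract's precision and recall, the harmonic mean comes out at roughly 0.920, which agrees with the reported 92.0% F1-score.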

Paper Structure (17 sections, 1 equation, 9 figures, 1 table, 1 algorithm)

This paper contains 17 sections, 1 equation, 9 figures, 1 table, 1 algorithm.

Figures (9)

  • Figure 1: An illustrative image of a PICU patient with different occlusions, taken from our CHU Sainte Justine database.
  • Figure 2: Our proposed pipeline (SOSS) for occlusion segmentation. The input image is fed to both the top branch and bottom branch simultaneously. Top branch: our DeepLabV3+-based network segments the input image and produces a semantic (occlusion) mask proposal. Bottom branch: the SAM-based generator produces a partitioning of the input image without corresponding semantic labels. Both kinds of masks are then fused using our proposed confidence-based soft voting mechanism for the final occlusion segmentation mask. This aims to add semantic information to the SAM branch while simultaneously improving the segmentation quality of the DeepLab branch.
  • Figure 3: A summary of our workflow, including 4 steps: labeling, data augmentation, model fine-tuning, and prediction.
  • Figure 4: Relative size distribution of the occlusion types. Each bar shows the relative size of one occlusion type, averaged over all images that contain that type. The "average occlusion" bar shows the average area that any occlusion occupies in an image.
  • Figure 5: A sampled image of a patient in the PICU with varied occlusions (left) and their corresponding ground truth annotations (right) in our dataset.
  • ...and 4 more figures
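
The Figure 2 caption describes fusing the DeepLabV3+ semantic mask with SAM's unlabeled partition through confidence-based soft voting. The following is a minimal sketch of one way such a fusion could work, not the authors' implementation: it assumes the CNN branch outputs per-class probabilities, that SAM segments arrive as boolean masks, and that each segment takes the class with the highest mean probability inside it when that mean clears a threshold. The function name `soft_vote_fusion`, the `min_confidence` parameter, and the toy arrays are all hypothetical.

```python
import numpy as np


def soft_vote_fusion(class_probs, sam_masks, min_confidence=0.5):
    """Fuse CNN class probabilities with class-agnostic segments.

    class_probs: float array of shape (C, H, W), e.g. a softmax output.
    sam_masks:   list of boolean arrays of shape (H, W), one per segment.
    Returns an integer label map of shape (H, W).
    """
    # Start from the CNN's own per-pixel decision.
    fused = class_probs.argmax(axis=0)
    for mask in sam_masks:
        if not mask.any():
            continue  # skip empty segments
        # Soft vote: average each class's probability over the segment.
        mean_probs = class_probs[:, mask].mean(axis=1)
        winner = int(mean_probs.argmax())
        # Relabel the whole segment only if the vote is confident enough
        # (assumed rule); otherwise keep the CNN's per-pixel labels.
        if mean_probs[winner] >= min_confidence:
            fused[mask] = winner
    return fused


# Toy 2x4 image with two classes: 0 = background, 1 = occlusion.
occlusion_prob = np.array([[0.9, 0.8, 0.4, 0.1],
                           [0.9, 0.7, 0.6, 0.1]])
probs = np.stack([1.0 - occlusion_prob, occlusion_prob])

# Two hypothetical segments: the left three columns and the right column.
left = np.zeros((2, 4), dtype=bool)
left[:, :3] = True
right = ~left

fused = soft_vote_fusion(probs, [left, right])
print(fused)
```

In this toy case the CNN alone labels the pixel with probability 0.4 as background, but the segment-level vote assigns it to the occlusion because the surrounding segment is confidently occluded. This mirrors the stated goal of giving SAM's segments semantic labels while using their boundaries to clean up the DeepLab mask.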