Hybrid Deep Learning-Based for Enhanced Occlusion Segmentation in PICU Patient Monitoring
Mario Francisco Munoz, Hoang Vu Huy, Thanh-Dung Le
TL;DR
This work tackles occlusion segmentation for pediatric remote patient monitoring (RPM) in PICUs under limited training data. It introduces a hybrid pipeline that fuses CNN-based semantic segmentation (DeepLabV3+) with the Segment Anything Model (SAM) through a soft-voting fusion mechanism (SOSS) to produce accurate occlusion masks. Evaluated on a real-world PICU dataset collected with a Kinect V2, the approach achieves an IoU of about 85% and improves on a CNN baseline across all metrics (e.g., accuracy ~92.5%, F1 ~92.0%), with an average improvement of 2.75 percentage points. The method makes RPM more reliable in clinical settings, supporting safer and more accurate remote monitoring of pediatric patients. It also highlights practical limitations, such as SAM's inference speed and prompting sensitivity, that guide future refinements.
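To make the idea of soft-voting fusion concrete, the following is a minimal, generic sketch and not the authors' SOSS implementation. It assumes that both models can supply a per-pixel occlusion probability map of the same size; the function name `soft_vote`, the `weight_a` parameter, and the 0.5 threshold are hypothetical choices made for illustration, since the summary does not specify them.

```python
from typing import List

ProbMap = List[List[float]]   # per-pixel occlusion probabilities in [0, 1]
BinaryMask = List[List[int]]  # 1 = occlusion, 0 = background


def soft_vote(prob_a: ProbMap, prob_b: ProbMap,
              weight_a: float = 0.5, threshold: float = 0.5) -> BinaryMask:
    """Fuse two probability maps by weighted averaging, then threshold.

    Assumption: prob_a could come from a CNN such as DeepLabV3+ and prob_b
    from SAM; the weighting and threshold are illustrative, not from the paper.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(prob_a) != len(prob_b):
        raise ValueError("probability maps must have the same height")

    fused: BinaryMask = []
    for row_a, row_b in zip(prob_a, prob_b):
        if len(row_a) != len(row_b):
            raise ValueError("probability maps must have the same width")
        fused.append([
            1 if weight_a * pa + (1.0 - weight_a) * pb >= threshold else 0
            for pa, pb in zip(row_a, row_b)
        ])
    return fused


# Toy 2x2 example with made-up probabilities.
cnn_probs = [[0.9, 0.4], [0.2, 0.7]]
sam_probs = [[0.8, 0.7], [0.1, 0.2]]
fused_mask = soft_vote(cnn_probs, sam_probs)
cnn_only_mask = soft_vote(cnn_probs, sam_probs, weight_a=1.0)
```

With equal weights, a pixel is kept when the two models' average confidence reaches the threshold, so one confident model can outvote a hesitant one. Setting `weight_a=1.0` reduces the fusion to thresholding the first map alone.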
Abstract
Remote patient monitoring has emerged as a prominent non-invasive method, using digital technologies and computer vision (CV) to replace traditional invasive monitoring. While neonatal and pediatric departments embrace this approach, Pediatric Intensive Care Units (PICUs) face the challenge of occlusions hindering accurate image analysis and interpretation. Objective: In this study, we propose a hybrid approach to effectively segment common occlusions encountered in remote monitoring applications within PICUs. Our approach centers on creating a deep-learning pipeline for limited training data scenarios. Methods: First, a combination of the well-established Google DeepLabV3+ segmentation model with the transformer-based Segment Anything Model (SAM) is devised for occlusion segmentation mask proposal and refinement. We then train and validate this pipeline using a small dataset acquired from real-world PICU settings with a Microsoft Kinect camera, achieving an Intersection-over-Union (IoU) metric of 85%. Results: Both quantitative and qualitative analyses underscore the effectiveness of our proposed method. The proposed framework yields an overall classification performance with 92.5% accuracy, 93.8% recall, 90.3% precision, and 92.0% F1-score. Consequently, the proposed method consistently improves the predictions across all metrics, with an average of 2.75% gain in performance compared to the baseline CNN-based framework. Conclusions: Our proposed hybrid approach significantly enhances the segmentation of occlusions in remote patient monitoring within PICU settings. This advancement contributes to improving the quality of care for pediatric patients, addressing a critical need in clinical practice by ensuring more accurate and reliable remote monitoring.
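For readers less familiar with the metrics quoted above, the sketch below shows how IoU, accuracy, precision, recall, and F1 are conventionally computed from a predicted and a ground-truth binary mask. It uses the standard textbook definitions; whether the paper computes its classification metrics per pixel or per image is not stated here, so the per-pixel treatment is an assumption, and the helper names are hypothetical.

```python
from typing import Dict, List

BinaryMask = List[List[int]]  # 1 = occlusion, 0 = background


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


def segmentation_metrics(pred: BinaryMask, truth: BinaryMask) -> Dict[str, float]:
    """Per-pixel metrics from two equally sized binary masks."""
    tp = fp = fn = tn = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1

    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "iou": ratio(tp, tp + fp + fn),  # intersection over union
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Toy masks (made up): 2 true positives, 1 false positive,
# 1 false negative, 2 true negatives.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
metrics = segmentation_metrics(pred_mask, true_mask)

# Consistency check on the reported precision and recall.
reported_f1 = f1_score(0.903, 0.938)
```

Applying `f1_score` to the reported 90.3% precision and 93.8% recall gives roughly 0.920, which agrees with the stated 92.0% F1-score.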
