
Automated Patient Positioning with Learned 3D Hand Gestures

Zhongpai Gao, Abhishek Sharma, Meng Zheng, Benjamin Planche, Terrence Chen, Ziyan Wu

TL;DR

This work tackles the bottleneck of manual patient positioning in medical imaging by introducing an automated, gesture-driven system that uses a ceiling-mounted camera to interpret technicians' intent. The authors present a multi-stage analytics pipeline comprising orientation- and association-aware hand detection, illumination-robust 2D landmark detection, and dual-modality 3D pose estimation, which together translate gestures into precise bed movements and region localization. Evaluations on real clinical data and the HaGRID dataset demonstrate high accuracy, robustness to low light and small hand regions, and real-time performance on edge hardware, suggesting practical value for MRI workflows. The approach promises improved throughput, reduced manual variability, and potential applicability to other scanners and interventional procedures.

Abstract

Positioning patients for scanning and interventional procedures is a critical task that requires high precision and accuracy. The conventional workflow involves manually adjusting the patient support to align the center of the target body part with the laser projector or other guiding devices. This process is not only time-consuming but also prone to inaccuracies. In this work, we propose an automated patient positioning system that utilizes a camera to detect specific hand gestures from technicians, allowing users to indicate the target patient region to the system and initiate automated positioning. Our approach relies on a novel multi-stage pipeline to recognize and interpret the technicians' gestures, translating them into precise motions of medical devices. We evaluate our proposed pipeline during actual MRI scanning procedures, using RGB-Depth cameras to capture the process. Results show that our system achieves accurate and precise patient positioning with minimal technician intervention. Furthermore, we validate our method on HaGRID, a large-scale hand gesture dataset, demonstrating its effectiveness in hand detection and gesture recognition.
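
To make the idea of translating an indicated patient region into a device motion concrete, the following is a minimal sketch and not the authors' implementation. It assumes a calibrated pinhole RGB-Depth camera, a known rigid transform from camera to table coordinates, and a table that moves along a single axis. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), the transform `cam_to_table`, and the isocenter position are all hypothetical and are not taken from the paper.

```python
import numpy as np


def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to 3D camera coordinates.

    Uses the standard pinhole model; the intrinsics are assumed to come
    from a prior camera calibration (hypothetical values in the example).
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m], dtype=float)


def table_offset(target_cam, cam_to_table, isocenter_table, axis=(1.0, 0.0, 0.0)):
    """Signed travel along the table axis that brings the target to the isocenter.

    target_cam      : 3D point in camera coordinates (e.g. an indicated body region).
    cam_to_table    : 4x4 homogeneous transform, camera frame -> table frame (assumed known).
    isocenter_table : 3D isocenter position in the table frame (assumed known).
    axis            : direction of table travel in the table frame (assumption: one axis).
    """
    target_h = np.append(np.asarray(target_cam, dtype=float), 1.0)
    target_table = (np.asarray(cam_to_table, dtype=float) @ target_h)[:3]
    unit_axis = np.asarray(axis, dtype=float)
    unit_axis = unit_axis / np.linalg.norm(unit_axis)
    return float(np.dot(np.asarray(isocenter_table, dtype=float) - target_table, unit_axis))


# Example with made-up numbers: a landmark at pixel (620, 240), 2 m from the camera.
point = backproject(620, 240, 2.0, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
offset = table_offset(point, np.eye(4), isocenter_table=[1.5, 0.0, 2.0])
```

In a real system the indicated 3D point would come from the estimated hand pose rather than from a single pixel, and the resulting offset would need to pass safety checks before any table motion is commanded.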

Paper Structure

This paper contains 17 sections, 2 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: Illustration of the proposed end-to-end automated patient positioning workflow.
  • Figure 2: Overall analytics pipeline.
  • Figure 3: ROC curves of different protocols.
  • Figure 4: Qualitative results: (a) with and without the three designs (orientation- and association-aware hand detection, augmented landmark detection, and dual-modality 3D hand pose); (b) comparison of MediaPipe and our method.
  • Figure 5: Qualitative comparison with HaGRID [kapitanov2024hagrid] on our gesture benchmark. Note that our hand detection model can associate hands with the body. Each hand and its associated body share a bounding-box color, and a line of the same color connects the top-left corners of the two boxes to denote the hand-body association.
  • ...and 1 more figure