Automated Patient Positioning with Learned 3D Hand Gestures
Zhongpai Gao, Abhishek Sharma, Meng Zheng, Benjamin Planche, Terrence Chen, Ziyan Wu
TL;DR
This work tackles the bottleneck of manual patient positioning in medical imaging by introducing an automated, gesture-driven system that uses a ceiling-mounted camera to interpret the technician's intent. The authors present a multi-stage analytics pipeline comprising orientation- and association-aware hand detection, illumination-robust 2D landmark detection, and dual-modality 3D pose estimation, which together translate gestures into precise bed movements and target-region localization. Evaluations on real clinical data and the HaGRID dataset demonstrate high accuracy, robustness to low light and small hand regions, and real-time performance on edge hardware, suggesting practical value for MRI workflows. The approach promises improved throughput and reduced manual variability, and may extend to other scanners and interventional procedures.
Abstract
Positioning patients for scanning and interventional procedures is a critical task that requires high precision and accuracy. The conventional workflow involves manually adjusting the patient support to align the center of the target body part with the laser projector or other guiding devices. This process is time-consuming and prone to inaccuracies. In this work, we propose an automated patient positioning system that uses a camera to detect specific hand gestures from technicians, allowing them to indicate the target patient region to the system and initiate automated positioning. Our approach relies on a novel multi-stage pipeline to recognize and interpret the technicians' gestures, translating them into precise motions of medical devices. We evaluate the proposed pipeline during actual MRI scanning procedures, using RGB-Depth cameras to capture the process. Results show that our system achieves accurate and precise patient positioning with minimal technician intervention. Furthermore, we validate our method on HaGRID, a large-scale hand gesture dataset, demonstrating its effectiveness in hand detection and gesture recognition.
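To make the final step of such a system concrete, the following is a minimal sketch of how a gesture-indicated 3D target could be turned into a bed movement. It is an illustration under stated assumptions, not the authors' implementation: it assumes a known rigid camera-to-table calibration (`rotation`, `translation`), a bed that travels along a single axis (`bed_axis`), and a known isocenter position in the table frame. All function and parameter names are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Sequence[float]


def to_table_frame(point_cam: Vec3, rotation: Sequence[Vec3], translation: Vec3) -> Tuple[float, ...]:
    """Map a 3D point from camera coordinates to table coordinates.

    Assumes a rigid calibration p_table = R * p_cam + t (hypothetical inputs).
    """
    return tuple(
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def bed_travel_mm(
    target_cam: Vec3,
    rotation: Sequence[Vec3],
    translation: Vec3,
    isocenter_table: Vec3,
    bed_axis: Vec3 = (0.0, 0.0, 1.0),
) -> float:
    """Signed travel along bed_axis that brings the target to the isocenter.

    Only the component of the offset along the bed's travel axis is used,
    since the bed is assumed to move in one direction only.
    """
    norm = sum(a * a for a in bed_axis) ** 0.5
    if norm == 0.0:
        raise ValueError("bed_axis must be a non-zero vector")
    target_table = to_table_frame(target_cam, rotation, translation)
    return sum(
        (isocenter_table[i] - target_table[i]) * bed_axis[i] / norm
        for i in range(3)
    )


# Usage with made-up calibration values (identity rotation, 50 mm z-offset).
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
travel = bed_travel_mm(
    target_cam=(10.0, 20.0, 100.0),
    rotation=IDENTITY,
    translation=(0.0, 0.0, 50.0),
    isocenter_table=(0.0, 0.0, 500.0),
)
print(f"Move bed by {travel:.1f} mm")
```

Projecting the offset onto the travel axis means lateral and vertical differences between the target and the isocenter are ignored; a real system would also need to handle calibration error and safety limits on bed motion.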
