Testing of Deep Learning Model in Real World Clinical Setting: A Case Study in Obstetric Ultrasound
Chun Kit Wong, Mary Ngo, Manxi Lin, Zahra Bashir, Amihai Heen, Morten Bo Søndergaard Svendsen, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen, Aasa Feragen
TL;DR
The paper addresses the gap between research validation and real-world clinical use by proposing a generic framework for deploying image-based AI in clinics. It demonstrates the framework with a Progressive Concept Bottleneck Model (PCBM) for fetal ultrasound standard-plane detection, evaluated in real-time sessions with novices and experts. Real-world feedback reveals potential benefits but highlights the need for navigational guidance and smoother workflow integration. Early deployment enables rapid, user-driven refinements and emphasizes acquisition challenges, illustrating the framework's practical value for translating research into clinical practice.
Abstract
Despite the rapid development of AI models in medical image analysis, their validation in real-world clinical settings remains limited. To address this, we introduce a generic framework designed for deploying image-based AI models in such settings. Using this framework, we deployed a trained model for fetal ultrasound standard plane detection and evaluated it in real-time sessions with both novice and expert users. Feedback from these sessions showed that, while the model offers potential benefits to medical practitioners, navigational guidance is a key area for improvement. These findings underscore the importance of deploying AI models early in real-world settings, which yields insights that can guide refinement of the model and system based on actual user feedback.
