Goal-conditioned reinforcement learning for ultrasound navigation guidance
Abdoul Aziz Amadou, Vivek Singh, Florin C. Ghesu, Young-Ho Kim, Laura Stanciulescu, Harshitha P. Sai, Puneet Sharma, Alistair Young, Ronak Rajani, Kawal Rhode
TL;DR
The paper addresses training efficiency and generalization for transesophageal echocardiography (TEE) navigation by introducing a goal-conditioned reinforcement learning (GCRL) framework built on Contrastive RL. It combines a physics-based CT-to-US simulation pipeline, Contrastive Patient Batching (CPB), and a data-augmented contrastive loss to learn representations that generalize to unseen patient anatomy, so that a single model can navigate to both standard and interventional views specified as goals. The critic is trained with a contrastive objective, $f(o_t,a_t,o_g) = \langle \phi(o_t,a_t), \psi(o_g)\rangle$, under the goal-conditioned reward $r_g(s_t,a_t) = (1 - \gamma)\, p(s_{t+1} = s_g \mid s_t, a_t)$; training is aided by an augmented batch strategy and multiple random perturbations. Developed on a dataset of $789$ patients and evaluated on $140$ test patients, the method achieves an average position error of $6.56$ mm and an average angle error of $9.36^{\circ}$, which is competitive with or superior to view-specific models. It also extends to non-standard views such as the Left Atrial Appendage (LAA) view, suggesting potential to support training and guidance in cardiac ultrasound practice.
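To make the inner-product critic concrete, the following is a minimal sketch, not the authors' implementation. It assumes the embeddings $\phi(o_t,a_t)$ and $\psi(o_g)$ are already computed (toy arrays here), and it uses a softmax (InfoNCE-style) cross-entropy with positives on the diagonal as a stand-in for the contrastive objective. The function names `critic_logits` and `contrastive_loss` are hypothetical, and CPB and the data augmentation are not modeled.

```python
import numpy as np

def critic_logits(phi_sa, psi_g):
    """Pairwise critic values f[i, j] = <phi(o_i, a_i), psi(g_j)>."""
    return phi_sa @ psi_g.T

def contrastive_loss(phi_sa, psi_g):
    """Softmax cross-entropy where sample i's own goal (column i) is the positive.

    Assumption: InfoNCE-style loss chosen for illustration only.
    """
    logits = critic_logits(phi_sa, psi_g)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy embeddings (assumed): one-hot goals, so alignment is easy to inspect.
batch = 4
goals = np.eye(batch)                      # stand-in for psi(o_g)
aligned = 3.0 * np.eye(batch)              # phi points at its own goal
mismatched = np.roll(aligned, 1, axis=0)   # phi points at another sample's goal

loss_aligned = contrastive_loss(aligned, goals)
loss_mismatched = contrastive_loss(mismatched, goals)
```

In this toy setting the loss is lower when each state-action embedding aligns with its own goal embedding than when it aligns with another sample's goal, which is the behavior a contrastive critic is trained to produce.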
Abstract
Transesophageal echocardiography (TEE) plays a pivotal role in cardiology for diagnostic and interventional procedures. However, using it effectively requires extensive training due to the intricate nature of image acquisition and interpretation. To enhance the efficiency of novice sonographers and reduce variability in scan acquisitions, we propose a novel ultrasound (US) navigation assistance method based on contrastive learning as goal-conditioned reinforcement learning (GCRL). We augment the previous framework with a novel contrastive patient batching (CPB) method and a data-augmented contrastive loss, both of which we demonstrate are essential to ensure generalization to anatomical variations across patients. The proposed framework enables navigation to both standard diagnostic views and intricate interventional views with a single model. Our method was developed with a large dataset of 789 patients and obtained an average error of 6.56 mm in position and 9.36 degrees in angle on a testing dataset of 140 patients, which is competitive with or superior to models trained on individual views. Furthermore, we quantitatively validate our method's ability to navigate to interventional views such as the Left Atrial Appendage (LAA) view used in LAA closure. Our approach holds promise in providing valuable guidance during transesophageal ultrasound examinations, contributing to the advancement of skill acquisition for cardiac ultrasound practitioners.
