Goal-conditioned reinforcement learning for ultrasound navigation guidance

Abdoul Aziz Amadou; Vivek Singh; Florin C. Ghesu; Young-Ho Kim; Laura Stanciulescu; Harshitha P. Sai; Puneet Sharma; Alistair Young; Ronak Rajani; Kawal Rhode

TL;DR

The paper tackles training efficiency and generalization for transesophageal echocardiography (TEE) navigation by introducing a goal-conditioned reinforcement learning (GCRL) framework built on Contrastive RL. It combines a physics-based CT-to-US simulation pipeline, Contrastive Patient Batching (CPB), and a data-augmented contrastive loss to learn representations that generalize to unseen patient anatomy, so that a single model can navigate to arbitrary goal views, both standard and interventional. The critic is trained with a contrastive objective on $f(o_t,a_t,o_g) = \langle \phi(o_t,a_t), \psi(o_g)\rangle$, with a goal-conditioned reward $r_g(s_t,a_t) = (1 - \gamma)\, p(s_{t+1} = s_g \mid s_t, a_t)$. Training is aided by an augmented batch strategy in which observations and goals receive multiple random perturbations. The method was developed on a dataset of 789 patients and, on a test set of 140 patients, achieves an average position error of 6.56 mm and an average angle error of $9.36^{\circ}$. This is competitive with or superior to view-specific models, and the method extends to non-standard Left Atrial Appendage (LAA) views, illustrating its potential to enhance training and guidance in cardiac ultrasound practice.
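The inner-product critic above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the encoders $\phi$ and $\psi$ have already produced embedding vectors (the toy vectors below are made up), and it uses a row-wise softmax cross-entropy with the diagonal entries as positives, which is one common contrastive loss and may differ from the loss used in the paper. The helper names `critic_logits` and `contrastive_loss` are hypothetical.

```python
import math


def inner(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def critic_logits(sa_repr, goal_repr):
    """Return the matrix with entry [i][j] = <phi_i, psi_j>.

    sa_repr:   list of N state-action embeddings phi(o_t, a_t)  (assumed given)
    goal_repr: list of N goal embeddings psi(o_g)               (assumed given)
    """
    return [[inner(phi, psi) for psi in goal_repr] for phi in sa_repr]


def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def contrastive_loss(logits):
    """Mean cross-entropy where the positive for row i is column i.

    Assumption: row i and column i come from the same trajectory, so the
    diagonal holds the matching (state-action, goal) pairs.
    """
    n = len(logits)
    return -sum(log_softmax(logits[i])[i] for i in range(n)) / n


# Toy embeddings (hypothetical values, two trajectories, dimension 2).
phi = [[1.0, 0.0], [0.0, 1.0]]
psi_aligned = [[2.0, 0.0], [0.0, 2.0]]   # goal i matches state-action i
psi_swapped = [[0.0, 2.0], [2.0, 0.0]]   # goals paired with the wrong rows

loss_aligned = contrastive_loss(critic_logits(phi, psi_aligned))
loss_swapped = contrastive_loss(critic_logits(phi, psi_swapped))
```

In this toy case the loss is lower when each goal embedding aligns with the state-action embedding from its own trajectory, which is the behavior the contrastive objective rewards.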

Abstract

Transesophageal echocardiography (TEE) plays a pivotal role in cardiology for diagnostic and interventional procedures. However, using it effectively requires extensive training due to the intricate nature of image acquisition and interpretation. To enhance the efficiency of novice sonographers and reduce variability in scan acquisitions, we propose a novel ultrasound (US) navigation assistance method based on contrastive learning as goal-conditioned reinforcement learning (GCRL). We augment the previous framework using a novel contrastive patient batching method (CPB) and a data-augmented contrastive loss, both of which we demonstrate are essential to ensure generalization to anatomical variations across patients. The proposed framework enables navigation to both standard diagnostic and intricate interventional views with a single model. Our method was developed with a large dataset of 789 patients and obtained an average error of 6.56 mm in position and 9.36 degrees in angle on a testing dataset of 140 patients, which is competitive with or superior to models trained on individual views. Furthermore, we quantitatively validate our method's ability to navigate to interventional views such as the Left Atrial Appendage (LAA) view used in LAA closure. Our approach holds promise in providing valuable guidance during transesophageal ultrasound examinations, contributing to the advancement of skill acquisition for cardiac ultrasound practitioners.

Paper Structure (6 sections, 3 equations, 3 figures, 2 tables)

Introduction
Methodology
Simulation environment
Goal-Conditioned Reinforcement Learning
Experiments and results
Discussion and conclusion

Figures (3)

Figure 1: System overview of Goal-conditioned RL for Ultrasound Navigation. We first segment CTs and generate ultrasound volume reconstructions for rapid sampling during training. The model is trained to reach randomly selected goal views by employing the contrastive patient batching (CPB) mechanism to create a contrastive batch from the collected experience. When deployed, the trained model can navigate to arbitrary views, including standard and interventional views.
Figure 2: Contrastive critic training: We build a contrastive batch using trajectories from two patients with CPB and pass (observation, action) pairs and goal images to the state-action and goal encoders respectively. Goal and observations are augmented $K$ times and we build $K^2$ intermediate matrices (not shown) from the inner product between all the encoded representations, with $Q_M^{Aug}$ as their average. The critic is trained to maximize the similarity between state-action and goal representations of the same trajectories, which corresponds to the diagonal of the matrices.
Figure 3: Example navigation to a view showing the LAA (orange box). The two rightmost pictures are projections showing the desired (green) and current (red) transducer positions.
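The averaging step described in the Figure 2 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the encoders and the image augmentations are omitted, the embeddings are made-up toy values, and the function name `averaged_critic_matrix` is hypothetical. Given $K$ augmented views of the state-action embeddings and $K$ augmented views of the goal embeddings, it forms all $K^2$ inner-product matrices and returns their mean, corresponding to $Q_M^{Aug}$ in the caption.

```python
def inner(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def averaged_critic_matrix(sa_views, goal_views):
    """Average the K*K inner-product matrices over all augmentation pairs.

    sa_views:   K lists, each holding N state-action embeddings
    goal_views: K lists, each holding N goal embeddings
    Returns (mean N x N matrix, number of matrices averaged).
    """
    n = len(sa_views[0])
    total = [[0.0] * n for _ in range(n)]
    count = 0
    for sa in sa_views:              # one augmentation of the observations
        for goals in goal_views:     # one augmentation of the goals
            for i in range(n):
                for j in range(n):
                    total[i][j] += inner(sa[i], goals[j])
            count += 1
    mean = [[x / count for x in row] for row in total]
    return mean, count


# Toy example (hypothetical values): K = 2 augmentations, N = 2 trajectories.
sa_views = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
goal_views = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 0.0], [0.0, 3.0]],
]

q_aug, num_matrices = averaged_critic_matrix(sa_views, goal_views)
```

Here `num_matrices` equals $K^2 = 4$, and the diagonal of `q_aug` holds the averaged similarity between state-action and goal embeddings from the same trajectory, the entries that critic training aims to maximize.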
