A Zero-Shot Reinforcement Learning Strategy for Autonomous Guidewire Navigation

Valentina Scarponi; Michel Duprez; Florent Nageotte; Stéphane Cotin

TL;DR

This paper proposes a zero-shot learning strategy for three-dimensional autonomous endovascular navigation using a very small training set of branching patterns, and proves its ability to navigate unseen geometries with different characteristics, thanks to a nearly shape-invariant observation space.

Abstract

Purpose: The treatment of cardiovascular diseases requires complex and challenging navigation of a guidewire and catheter. This often leads to lengthy interventions during which the patient and clinician are exposed to X-ray radiation. Deep Reinforcement Learning approaches have shown promise in learning this task and may be the key to automating catheter navigation during robotized interventions. Yet, existing training methods generalize poorly to unseen vascular anatomies and must be retrained each time the geometry changes. Methods: In this paper, we propose a zero-shot learning strategy for three-dimensional autonomous endovascular navigation. Using a very small training set of branching patterns, our reinforcement learning algorithm learns a control policy that can then be applied to unseen vascular anatomies without retraining. Results: We demonstrate our method on 4 different vascular systems, with an average success rate of 95% at reaching random targets on these anatomies. Our strategy is also computationally efficient, allowing our controller to be trained in only 2 hours. Conclusion: Our training method proved its ability to navigate unseen geometries with different characteristics, thanks to a nearly shape-invariant observation space.

Paper Structure (13 sections, 5 equations, 4 figures)

Introduction
Materials and Methods
Training environment
Simulation of guidewire navigation
Training strategy
Soft Actor-Critic algorithm
Zero-shot Reinforcement Learning Strategy
Nearly shape-invariant observation space
Design of the training anatomies
Results
Training anatomies
Navigation test on complex vascular trees
Conclusion

Figures (4)

Figure 1: Geometries used to test the algorithm. The blue dots represent the insertion points, while the green dots show all possible target locations. The red circle shows an example of a bifurcation region.
Figure 2: Our observation space is composed of: 1) $t_i \cdot c_j$, with $i, j \in \{1, 2, 3\}$ (a); 2) the normalized distance between the tip of the guidewire and the target; 3) the chosen action; 4) $k_p \cdot w_p$ (b); 5) $v \cdot c_i$ (c).
Figure 3: Geometries used to study the sensitivity of our training with respect to changes in position (a), orientation (a) and shape (b,c,d). The blue dot marks the entry branch; the green dots mark the exits.
Figure 4: Geometries chosen to train the agent and to cover the whole observation space. In each model, the blue point represents the starting point, and the green one the target location.
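
The five components listed in the Figure 2 caption can be assembled into a single observation vector roughly as sketched below. This is only an illustration under stated assumptions, not the authors' implementation: the captions do not define $t_i$, $c_j$, $k_p$, $w_p$ or $v$, so the sketch treats all of them as 3-vectors, leaves the range of $p$ to whatever the caller passes, and normalizes the tip-to-target distance by a hypothetical `reference_length`. The function name `build_observation` and all parameter names are likewise made up for this example. Because every geometric entry is a dot product, the vector depends on relative angles rather than on absolute position or orientation, which would be consistent with the "nearly shape-invariant" description.

```python
from typing import List, Sequence

Vec3 = Sequence[float]


def dot(a: Vec3, b: Vec3) -> float:
    """Euclidean dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_observation(
    t: Sequence[Vec3],        # t_1..t_3 from the caption (meaning assumed, not given)
    c: Sequence[Vec3],        # c_1..c_3 from the caption (meaning assumed, not given)
    tip_to_target: float,     # distance between guidewire tip and target
    reference_length: float,  # hypothetical normalization constant
    action: Sequence[float],  # the chosen action, appended as-is
    k: Sequence[Vec3],        # k_p from the caption (assumed to be 3-vectors)
    w: Sequence[Vec3],        # w_p from the caption (assumed to be 3-vectors)
    v: Vec3,                  # v from the caption (assumed to be a 3-vector)
) -> List[float]:
    obs: List[float] = []
    # 1) all pairwise products t_i . c_j (9 values for i, j in {1, 2, 3})
    obs += [dot(t_i, c_j) for t_i in t for c_j in c]
    # 2) normalized tip-to-target distance
    obs.append(tip_to_target / reference_length)
    # 3) the chosen action
    obs += list(action)
    # 4) one product k_p . w_p per index p
    obs += [dot(k_p, w_p) for k_p, w_p in zip(k, w)]
    # 5) products v . c_i
    obs += [dot(v, c_i) for c_i in c]
    return obs


# Toy inputs: coordinate axes stand in for both vector triplets.
axes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
obs = build_observation(
    t=axes, c=axes,
    tip_to_target=5.0, reference_length=10.0,
    action=[1.0, 0.0],
    k=[(1.0, 0.0, 0.0)], w=[(2.0, 0.0, 0.0)],
    v=(0.0, 1.0, 0.0),
)
print(len(obs))  # 9 + 1 + 2 + 1 + 3 = 16
```

Rotating or translating all input vectors together leaves every dot product unchanged, so in this sketch only the action entries and the normalized distance carry information that is not purely relative.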
