Sim4EndoR: A Reinforcement Learning Centered Simulation Platform for Task Automation of Endovascular Robotics

Tianliang Yao, Madaoji Ban, Bo Lu, Zhiqiang Pei, Peng Qi

TL;DR

The paper tackles operator dependency in robotic PCI by introducing Sim4EndoR, a 3D reinforcement learning–driven simulation platform for training and evaluating autonomous endovascular tasks, with the aim of safe sim-to-real translation. It formulates the PCI task as a Markov Decision Process with a curvature-aware reward on a vascular manifold, and couples a SOFA-based, physics-grounded guidewire model to neural policy learning for realistic control. Key contributions include a curvature-distance reward design, a modular 3D simulation pipeline (SOFA, Blender, Onshape), and policy deployment on real endovascular hardware with success rates of 70% or higher across tasks. The results indicate that sim-to-real transfer is viable, which could reduce the number of hardware trials and improve the safety and consistency of PCI interventions. The paper acknowledges open challenges in tissue property fidelity and precise tip dynamics.

Abstract

Robotic-assisted percutaneous coronary intervention (PCI) holds considerable promise for elevating precision and safety in cardiovascular procedures. Nevertheless, current systems heavily depend on human operators, resulting in variability and the potential for human error. To tackle these challenges, Sim4EndoR, an innovative reinforcement learning (RL) based simulation environment, is first introduced to bolster task-level autonomy in PCI. This platform offers a comprehensive and risk-free environment for the development, evaluation, and refinement of potential autonomous systems, enhancing data collection efficiency and minimizing the need for costly hardware trials. A notable aspect of the groundbreaking Sim4EndoR is its reward function, which takes into account the anatomical constraints of the vascular environment, utilizing the geometric characteristics of vessels to steer the learning process. By seamlessly integrating advanced physical simulations with neural network-driven policy learning, Sim4EndoR fosters efficient sim-to-real translation, paving the way for safer, more consistent robotic interventions in clinical practice, ultimately improving patient outcomes.
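
To make the idea of a curvature-distance reward concrete, the following is a minimal Python sketch, not the paper's formulation. The exact reward is not given in this summary, so everything here is an assumption for illustration: the function names, the weights `w_dist` and `w_curv`, the representation of the vessel as a polyline centerline, and the choice to penalize off-centerline deviation more strongly where the vessel bends sharply.

```python
import math

def menger_curvature(p0, p1, p2):
    """Curvature of the circle through three 3D points (0.0 if collinear)."""
    a, b, c = math.dist(p0, p1), math.dist(p1, p2), math.dist(p0, p2)
    if a == 0 or b == 0 or c == 0:
        return 0.0
    s = (a + b + c) / 2  # Heron's formula for the triangle area
    area_sq = max(s * (s - a) * (s - b) * (s - c), 0.0)
    return 4 * math.sqrt(area_sq) / (a * b * c)

def nearest_index(centerline, tip):
    """Index of the centerline sample closest to the guidewire tip."""
    return min(range(len(centerline)), key=lambda i: math.dist(centerline[i], tip))

def remaining_arc_length(centerline, i):
    """Path length from sample i to the last sample (the target)."""
    return sum(math.dist(centerline[k], centerline[k + 1])
               for k in range(i, len(centerline) - 1))

def curvature_distance_reward(centerline, tip, w_dist=1.0, w_curv=0.5):
    """Hypothetical reward: closer to the target along the vessel is better;
    straying from the centerline costs more in high-curvature segments."""
    i = nearest_index(centerline, tip)
    dist_term = remaining_arc_length(centerline, i)
    if 0 < i < len(centerline) - 1:
        kappa = menger_curvature(centerline[i - 1], centerline[i], centerline[i + 1])
    else:
        kappa = 0.0  # curvature is undefined at the endpoints of the polyline
    off_axis = math.dist(centerline[i], tip)
    return -w_dist * dist_term - w_curv * kappa * off_axis
```

In this sketch the reward is zero when the tip sits on the final centerline sample and becomes more negative as the remaining path length grows. An RL environment would call `curvature_distance_reward` once per step with the current tip position; how the actual platform reads the tip state from SOFA is outside the scope of this example.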

Paper Structure

This paper contains 16 sections, 4 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Framework of Sim4EndoR for achieving embodied AI in PCI procedures: (a) Traditional interventional cardiologists require a prolonged learning curve to attain proficiency in intricate guidewire manipulation techniques. Additionally, engaging in numerous interventional procedures over an extended period can cause fatigue and radiation-induced illnesses among cardiologists. This, in turn, may lead to unsuccessful procedures or associated complications in complex cases, such as vessel perforation. (b) The proposed Sim4EndoR encompasses a simulation platform dedicated to policy training. Within this platform, the RL agent simulates the actions and kinematics of the guidewire, learning optimal manipulation strategies through observation and reward mechanisms. Furthermore, Sim4EndoR incorporates a robotic guidewire delivery system designed for real-world applications. Policy deployment is facilitated by the physical manipulation system, enabling precise guidewire manipulation that ultimately contributes to successful interventions.
  • Figure 2: Illustration of the guidewire navigation task within the Simplified Vascular Phantom. (a) The vascular network with bifurcation points. (b) Task A: Navigation to End Point A. (c) Task B: Navigation to End Point B. (d) and (e) show the guidewire reaching the designated End Points A and B, respectively, within the simulated environment.
  • Figure 3: The simulation of the Vascular Phantom, which corresponds to a real-world physical model for deployment, initiates navigation from the Start Point with the objective of accurately reaching one of the targeted End Points (A or B).
  • Figure 4: The sequence of keyframes extracted from recorded videos demonstrates the autonomous navigation of the guidewire, which is achieved based on the proposed skill learning paradigm. Notably, the guidewire adeptly navigates through multiple vascular bifurcations to reach the designated target location, all without relying on any real-time position feedback.