Autonomous Path Planning for Intercostal Robotic Ultrasound Imaging Using Reinforcement Learning

Yuan Bi; Cheng Qian; Zhicheng Zhang; Nassir Navab; Zhongliang Jiang

TL;DR

This work tackles autonomous intercostal ultrasound path planning by training a reinforcement learning agent in a CT-atlas-based simulated environment to generate shadow-free scanning trajectories that fully cover targets beneath the rib cage. The scene is encoded as a 3-channel voxelized state, and the agent navigates with a four-DoF cylindrical action space plus a readjust switch. A double dueling DQN optimizes a composite reward that favors coverage, attenuation minimization, and shadow avoidance. Experiments on unseen CTs with varying target sizes and multiple targets demonstrate robust planning performance and reveal how intercostal geometry affects success, supporting the approach as a foundation for fully autonomous robotic ultrasound systems (RUSS). The work is a step toward clinically practical autonomous US scanning: the high-level path planner is intended to be coupled with registration and robotic control modules in future work for real-world deployment.
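
For orientation, a double dueling DQN is commonly written as below. This is the standard formulation and only an assumption about the exact form used in the paper, since this summary does not reproduce the paper's equations. The symbols $V$ (state value), $A$ (action advantage), $w$ (online network parameters), $w^-$ (target network parameters), $\gamma$ (discount factor), and $\mathcal{A}$ (discrete action set) are introduced here for illustration.

$$Q(s, a; w) = V(s; w) + A(s, a; w) - \frac{1}{|\mathcal{A}|} \sum_{a' \in \mathcal{A}} A(s, a'; w)$$

$$y = r + \gamma \, Q\big(s', \arg\max_{a'} Q(s', a'; w); \, w^-\big)$$

The first line is the dueling aggregation of value and advantage streams; the second is the double Q-learning target, where the online network selects the next action and the target network evaluates it.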

Abstract

Ultrasound (US) has been widely used in daily clinical practice for screening internal organs and guiding interventions. However, due to the acoustic shadow cast by the subcutaneous rib cage, US examination in thoracic applications remains challenging. To fully cover and reconstruct the region of interest in US for diagnosis, an intercostal scanning path is necessary. To tackle this challenge, we present a reinforcement learning (RL) approach for planning scanning paths between ribs to monitor changes in lesions on internal organs, such as the liver and heart, which lie beneath the rib cage. Structured anatomical information of the human skeleton is crucial for planning these intercostal paths. To obtain such anatomical insight, an RL agent is trained in a virtual environment constructed using computed tomography (CT) templates with randomly initialized tumors of various shapes and locations. In addition, task-specific state representation and reward functions are introduced to ensure the convergence of the training process while minimizing the effects of acoustic attenuation and shadows during scanning. To validate the effectiveness of the proposed approach, experiments have been carried out on unseen CTs with randomly defined single or multiple scanning targets. The results demonstrate the efficiency of the proposed RL framework in planning non-shadowed US scanning trajectories in areas with limited acoustic access.

Paper Structure (25 sections, 11 equations, 6 figures, 2 tables)

Introduction
Related Work
Path Planning for RUSS
Reinforcement Learning for RUSS
Preliminaries
Deep Q-Learning
Double Deep Q-Learning
Dueling Deep Q-Learning
Intercostal Path Planning for Robotic US
Environment Setup
State Design
Action Design
Reward Design
Coverage Level of Objects of Interest
Attenuation Minimization
...and 10 more sections

Figures (6)

Figure 1: (a) Our target is to plan a US scanning trajectory that fully covers a specific area under the ribs. Such an area can be an already identified tumor or a suspicious region defined by the doctors based on CT images. (b) Combined with the tracking information, the final goal is to realize the US reconstruction of the selected area. The planned trajectory should avoid occlusion by bones and perform the US acquisition through intercostal gaps. Two representative probe positions are shown in (a), and the resulting US images are displayed in (c) and (d), respectively.
Figure 2: Overview of the proposed framework. The scanning target is determined by the doctors on CT and input to a simulator. The overall scanning context is depicted using a 3-channel 3D matrix. Each element within the matrix corresponds to a voxel within the space, and each channel indicates the presence of a tumor, bone, or an ultrasound ray within the corresponding voxel. Subsequently, an RL model is employed to generate the scanning trajectory. In the end, the planned path is projected back to the skin surface of the patient.
Figure 3: Illustrations of the voxelization process.
Figure 4: (a) Illustrations of the cylindrical coordinate system. (b) Mapping from the cylinder surface to the skin surface.
Figure 5: Structure of the dueling deep Q-learning network, where $h$ represents the translational movement along the center line of the cylindrical coordinate system, $\theta$ denotes the rotation around the cylinder center line, $\phi$ is the rotation angle around the z-axis of the probe, and $\psi$ represents the rotation around the long axis of the probe footprint. $^+$ and $^-$ describe the direction of the actions.
...and 1 more figures
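
The state and action descriptions in the captions of Figures 2 and 5 can be made concrete with a minimal sketch. It assumes the state is a stack of three binary voxel grids (tumor, bone, ultrasound ray) and that the discrete action set consists of a positive and a negative step for each of $h$, $\theta$, $\phi$, $\psi$, plus the readjust switch mentioned in the TL;DR. The channel order, the step sizes, the names `encode_state`, `ACTIONS`, `STEP`, and `apply_action`, and the treatment of readjust as a placeholder are all hypothetical and not taken from the paper.

```python
import numpy as np

# Channel order is an assumption; the caption only names the three contents.
TUMOR, BONE, RAY = 0, 1, 2


def encode_state(tumor_mask, bone_mask, ray_mask):
    """Stack three boolean voxel grids into a (3, X, Y, Z) float32 array."""
    masks = [np.asarray(m, dtype=bool) for m in (tumor_mask, bone_mask, ray_mask)]
    if len({m.shape for m in masks}) != 1:
        raise ValueError("all masks must share one voxel grid shape")
    return np.stack(masks).astype(np.float32)


# Eight signed moves (h, theta, phi, psi, each + and -) plus the readjust switch.
ACTIONS = [(dof, sign) for dof in ("h", "theta", "phi", "psi") for sign in (+1, -1)]
ACTIONS.append(("readjust", 0))

# Hypothetical step sizes: millimetres for h, degrees for the three angles.
STEP = {"h": 5.0, "theta": 5.0, "phi": 5.0, "psi": 5.0}


def apply_action(pose, action_index):
    """Return a new pose dict after one discrete action.

    The readjust switch is left as a no-op placeholder because its exact
    behavior is not described in this summary.
    """
    dof, sign = ACTIONS[action_index]
    new_pose = dict(pose)
    if dof != "readjust":
        new_pose[dof] += sign * STEP[dof]
    return new_pose


if __name__ == "__main__":
    grid = (4, 4, 4)
    tumor = np.zeros(grid, dtype=bool)
    tumor[1, 1, 1] = True
    bone = np.zeros(grid, dtype=bool)
    bone[0, :, :] = True
    ray = np.zeros(grid, dtype=bool)

    state = encode_state(tumor, bone, ray)
    pose = {"h": 0.0, "theta": 0.0, "phi": 0.0, "psi": 0.0}
    print(state.shape, len(ACTIONS), apply_action(pose, 0))
```

Under these assumptions the network input has shape `(3, X, Y, Z)` and the Q-network output has nine entries, one per element of `ACTIONS`.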
