UltraSeP: Sequence-aware Pre-training for Echocardiography Probe Movement Guidance

Haojun Jiang; Teng Wang; Zhenguo Sun; Yulin Wang; Yang Yue; Yu Sun; Ning Jia; Meng Li; Shaqi Luo; Shiji Song; Gao Huang


TL;DR

UltraSeP introduces sequence-aware pre-training to address individual variability in echocardiography by learning personalized cardiac structures from scanning sequences. The method uses a vision transformer and an action encoder to predict masked image features and probe movements within a trajectory, with segmental sampling and EMA-based targets to enhance learning. Downstream, a shared sequence transformer with ten plane-specific heads enables accurate probe guidance toward ten standard planes, outperforming diverse baselines with statistically significant improvements. The approach demonstrates strong generalization, robustness to cardiac-cycle variation, and real-time inference potential, supporting AI-assisted or robotic echocardiography in clinical settings.
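The "EMA-based targets" mentioned above can be illustrated with a minimal sketch. It assumes the common setup in which prediction targets come from a slowly updated copy of the online encoder; the function name `ema_update`, the flat dictionary of scalar weights, and the momentum value are illustrative assumptions, not details taken from the paper.

```python
def ema_update(target, online, momentum=0.99):
    """Move each target weight a small step toward the matching online weight.

    `target` and `online` are dicts mapping parameter names to floats
    (a stand-in for real tensors). The momentum value is an assumption.
    """
    if target.keys() != online.keys():
        raise ValueError("target and online must share the same parameter names")
    if not 0.0 <= momentum <= 1.0:
        raise ValueError("momentum must lie in [0, 1]")
    return {
        name: momentum * target[name] + (1.0 - momentum) * online[name]
        for name in target
    }


if __name__ == "__main__":
    target = {"w": 1.0, "b": 0.0}
    online = {"w": 0.0, "b": 2.0}
    for _ in range(3):
        target = ema_update(target, online, momentum=0.9)
    print(target)
```

With a momentum close to 1, the target weights change slowly, so the features the model is asked to predict stay stable from one training step to the next.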

Abstract

Echocardiography is an essential medical technique for diagnosing cardiovascular diseases, but its high operational complexity has led to a shortage of trained professionals. To address this issue, we introduce a novel probe movement guidance algorithm that has the potential to be applied in guiding robotic systems or novices with probe pose adjustment for high-quality standard plane image acquisition. Cardiac ultrasound faces two major challenges: (1) the inherently complex structure of the heart, and (2) significant individual variations. Previous works have only learned the population-averaged structure of the heart rather than personalized cardiac structures, leading to a performance bottleneck. Clinically, we observe that sonographers dynamically adjust their interpretation of a patient's cardiac anatomy based on prior scanning sequences, consequently refining their scanning strategies. Inspired by this, we propose a novel sequence-aware self-supervised pre-training method. Specifically, our approach learns personalized three-dimensional cardiac structural features by predicting the masked-out image features and probe movement actions in a scanning sequence. We hypothesize that if the model can predict the missing content, it has acquired a good understanding of personalized cardiac structure. Extensive experiments on a large-scale expert scanning dataset with 1.67 million samples demonstrate that our proposed sequence-aware paradigm can effectively reduce probe guidance errors compared to other advanced baseline methods.
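As a rough illustration of the masking step described in the abstract, the sketch below picks which images and which relative movement actions to hide in one scanning sequence. It assumes a sequence of N images has N - 1 actions between consecutive images; the function name `sample_masks` and the mask ratios are hypothetical and are not the paper's settings.

```python
import random


def sample_masks(num_images, image_mask_ratio, action_mask_ratio, rng=None):
    """Choose image and action positions to mask in one scanning sequence.

    Assumes `num_images` images with `num_images - 1` relative actions
    between consecutive images. Returns two sorted index lists.
    """
    if num_images < 2:
        raise ValueError("need at least two images to have one action")
    for ratio in (image_mask_ratio, action_mask_ratio):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("mask ratios must lie in [0, 1]")
    rng = rng or random.Random()
    num_actions = num_images - 1
    # Truncate toward zero so the count never exceeds the requested ratio.
    n_img = int(num_images * image_mask_ratio)
    n_act = int(num_actions * action_mask_ratio)
    masked_images = sorted(rng.sample(range(num_images), n_img))
    masked_actions = sorted(rng.sample(range(num_actions), n_act))
    return masked_images, masked_actions


if __name__ == "__main__":
    images, actions = sample_masks(8, 0.5, 0.5, rng=random.Random(0))
    print("masked image positions:", images)
    print("masked action positions:", actions)
```

During pre-training, the model would then be asked to recover the features and actions at the returned positions from the unmasked remainder of the sequence.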


Paper Structure (16 sections, 9 equations, 13 figures, 7 tables)


Introduction
Related Work
Dataset and Method
Echocardiographic Scanning Dataset
UltraSeP: Sequence-aware Pre-training
Sampling Strategy
Downstream Transfer
Experiments
Implementation Details
Comparison with Baselines
Analysis on Demographic Variables
Ablation and Discussion of Key Components
Ablation of Mask Hyper-parameters
Visualization
Efficiency
...and 1 more sections

Figures (13)

Figure 1: Motivation of our work. During scanning, sonographers develop a cognitive understanding of the patient's cardiac anatomy using past trajectories (plane images and probe poses), which guides subsequent scanning.
Figure 2: Illustration of the probe movement guidance task. (a) Cardiac ultrasound requires probe pose adjustment to capture different standard plane images. (b) We develop a guidance model that predicts probe movements for target plane acquisition from scan trajectories. (c) With the guidance of this model, an ultrasound robotic system or novices have the potential to perform cardiac ultrasound examinations more effectively.
Figure 3: Echocardiographic scanning dataset. (a) Dataset statistics. The dataset was collected by two senior sonographers who performed scans on 238 adult subjects, two scans per individual. This resulted in a large-scale dataset containing 1.67 million samples. (b) Dataset collection system. The sonographer operated a probe attached to the end of a robotic arm for scanning. The system records the ultrasound images along with the corresponding probe poses. (c) Collected ten standard planes. The sonographer scanned six standard planes from the parasternal window and four standard planes from the apical window. Images are sourced from the guideline (Mitchell et al., 2019).
Figure 4: Diagram illustrating the sequence-aware pre-training method. Please zoom in to view. The input consists of a scanning sequence, including ultrasound images and corresponding relative movement actions between them. A portion of the input is randomly masked, and the model is required to recover masked visual features and actions.
Figure 5: Diagram illustrating sampling protocol and strategy. (a1) Unidirectional protocol samples from only one side of the current plane. (a2) Bidirectional protocol allows sampling from the entire scan, leveraging as much individual cardiac structural information as possible. (b) Segmental sampling ensures that sampled instances span the entire interval.
...and 8 more figures
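
The segmental sampling idea in the Figure 5 caption can be sketched as follows: the interval is split into equal-width segments and one frame index is drawn from each, so the samples cover the whole interval instead of clustering. The function name `segmental_sample`, the half-open interval convention, and the uniform draw within each segment are assumptions made for illustration.

```python
import random


def segmental_sample(start, end, num_segments, rng=None):
    """Draw one index from each of `num_segments` equal-width parts of [start, end).

    Guarantees that the returned indices are increasing and span the interval.
    """
    length = end - start
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    if length < num_segments:
        raise ValueError("interval is shorter than the number of segments")
    rng = rng or random.Random()
    picks = []
    for k in range(num_segments):
        # Integer boundaries of the k-th segment; never empty since length >= num_segments.
        lo = start + (k * length) // num_segments
        hi = start + ((k + 1) * length) // num_segments
        picks.append(rng.randrange(lo, hi))
    return picks


if __name__ == "__main__":
    print(segmental_sample(0, 100, 5, rng=random.Random(0)))
```

By contrast, drawing the same number of indices uniformly from the whole interval could leave large stretches of the scan unsampled.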
