MambaXCTrack: Mamba-based Tracker with SSM Cross-correlation and Motion Prompt for Ultrasound Needle Tracking

Yuelin Zhang, Long Lei, Wanquan Yan, Tianyi Zhang, Raymond Shing-Yan Tang, Shing Shin Cheng

TL;DR

This work addresses the challenge of robust ultrasound needle tracking under noise, artifacts, and intermittent tip visibility. It introduces MambaXCTrack, a Mamba-based tracker that integrates SSMX-Corr for long-range semantic modeling, Cross-map Interleaved Scan (CIS) for local interaction, and an implicit low-level motion descriptor as a non-visual prompt, forming an end-to-end tracking framework. Across phantom and tissue experiments, it achieves state-of-the-art performance on $AUC$, $P$, and $P_{norm}$ with near-ground-truth precision ($0.34\pm0.55$ mm average) at a real-time speed of $34.9$ FPS, demonstrating robustness to challenging ultrasound imaging conditions. The findings highlight the value of combining long-range SSM-based cross-correlation with motion-informed prompts for reliable interventional tracking and provide a foundation for efficient Mamba-based trackers in ultrasound-guided procedures.

Abstract

Ultrasound (US)-guided needle insertion is widely employed in percutaneous interventions. However, providing feedback on the needle tip position via US imaging is challenging because of noise, artifacts, and the thin imaging plane of US, which degrade needle features and lead to intermittent tip visibility. This paper proposes MambaXCTrack, a Mamba-based US needle tracker that uses structured state space model cross-correlation (SSMX-Corr) and an implicit motion prompt; it is the first application of Mamba to US needle tracking. SSMX-Corr enhances cross-correlation through long-range modeling and global searching of distant semantic features between the template and search maps, which benefits tracking under noise and artifacts by implicitly learning potential distant semantic cues. Combining SSMX-Corr with the cross-map interleaved scan (CIS) additionally introduces local pixel-wise interaction with a positional inductive bias. An implicit low-level motion descriptor is proposed as a non-visual prompt to enhance tracking robustness and address the intermittent tip visibility problem. Extensive experiments on a dataset of motorized needle insertions in both phantom and tissue samples demonstrate that the proposed tracker outperforms other state-of-the-art trackers, while ablation studies further highlight the effectiveness of each proposed tracking module.
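
For orientation, the conventional operation that SSMX-Corr is described as enhancing can be sketched as a plain sliding-window cross-correlation between a template map and a search map, as used in Siamese-style trackers. This is only an illustrative baseline, not the paper's SSMX-Corr: the function names `cross_correlate` and `argmax_2d` and the toy maps are assumptions introduced here.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` (2-D lists of numbers) and
    return the response map of inner products at each valid offset."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            # Inner product between the template and the window at (i, j).
            score = sum(
                search[i + di][j + dj] * template[di][dj]
                for di in range(th)
                for dj in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def argmax_2d(response):
    """Return the (row, col) offset of the largest response value."""
    best = max(
        (value, i, j)
        for i, row in enumerate(response)
        for j, value in enumerate(row)
    )
    return best[1], best[2]


# Toy example (hypothetical values): the template pattern sits at offset (1, 1).
search = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]

response = cross_correlate(search, template)
peak = argmax_2d(response)
```

Each response value depends only on the window directly under the template, so this baseline has no access to distant features; the abstract presents long-range modeling between the template and search maps as the motivation for SSMX-Corr.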

Paper Structure

This paper contains 12 sections, 6 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Structure overview of the proposed MambaXCTrack. The ResNet backbone is cascaded with four Mamba heads. Each Mamba head has the same structure. $z$ and $x$ are the embedding features of template $Z$ and search $X$.
  • Figure 2: A demonstration of the cross-map interleaved scan (CIS). CIS is performed in four directions, from which four sequences are regrouped separately. Assuming $z$ and $x_i$ are both of size $4\times4$ with elements numbered $1\sim16$, the four resulting sequences, each with $m$ concatenated, are shown.
  • Figure 3: Experimental setup. The in-plane case (imaging plane aligned parallel with needle trajectory) is shown as an example.
  • Figure 4: Tracking demonstration of MambaXCTrack against three SOTA trackers in three scenarios. The view is zoomed in. The left column is the forward procedure, while the right is backward. The ground-truth tip position and needle trajectory are marked. See more examples in the supplementary video.
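
The Figure 2 caption can be made concrete with a small sketch. The exact scan directions, the interleaving pattern, and the position of $m$ are not given in the text here, so the following is a guess under stated assumptions: the four directions are taken to be row-wise forward and backward and column-wise forward and backward, tokens from $z$ and $x_i$ alternate one by one, and $m$ is appended at the end of each sequence. All function names are hypothetical.

```python
def scan_orders(n):
    """Four index orderings over an n x n map whose elements are
    numbered 1..n*n in row-major order (assumed scan directions)."""
    row_major = [r * n + c + 1 for r in range(n) for c in range(n)]
    col_major = [r * n + c + 1 for c in range(n) for r in range(n)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


def interleave(z_tokens, x_tokens):
    """Alternate tokens from the two maps: z, x, z, x, ... (assumed pattern)."""
    out = []
    for z_tok, x_tok in zip(z_tokens, x_tokens):
        out.extend([z_tok, x_tok])
    return out


def cross_map_interleaved_scan(n=4, motion_token="m"):
    """Build one interleaved sequence per direction and append the
    motion token at the end (its position is an assumption)."""
    sequences = {}
    for name, order in scan_orders(n).items():
        z_tokens = [f"z{k}" for k in order]
        x_tokens = [f"x{k}" for k in order]
        sequences[name] = interleave(z_tokens, x_tokens) + [motion_token]
    return sequences


sequences = cross_map_interleaved_scan()
for name, seq in sequences.items():
    print(name, seq[:6], "...", seq[-1])
```

Under these assumptions, each sequence places the elements of $z$ and $x_i$ that share a spatial index next to each other, which is one way to obtain the local pixel-wise interaction the abstract attributes to CIS.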