A-MFST: Adaptive Multi-Flow Sparse Tracker for Real-Time Tissue Tracking Under Occlusion

Yuxin Chen, Zijian Wu, Adam Schmidt, Septimiu E. Salcudean

TL;DR

This work extends SENDD to enhance occlusion detection and tracking consistency while maintaining real-time performance, and improves tracking accuracy and reliability by integrating SAM2 for robust occlusion handling and employing forward–backward consistency for optimal frame selection.

Abstract

Purpose: Tissue tracking is critical for downstream tasks in robot-assisted surgery. The Sparse Efficient Neural Depth and Deformation (SENDD) model has previously demonstrated accurate, real-time sparse point tracking but struggled with occlusion handling. This work extends SENDD to enhance occlusion detection and tracking consistency while maintaining real-time performance.

Methods: We use the Segment Anything Model 2 (SAM2) to detect and mask occlusions by surgical tools, and we develop and integrate into SENDD an Adaptive Multi-Flow Sparse Tracker (A-MFST) with forward-backward consistency metrics to enhance occlusion and uncertainty estimation. A-MFST is an unsupervised variant of the Multi-Flow Dense Tracker (MFT).

Results: We evaluate our approach on the STIR dataset and demonstrate a significant improvement in tracking accuracy under occlusion, reducing Mean Endpoint Error (MEE) by 12 percent and improving the accuracy averaged over thresholds of 4, 8, 16, 32, and 64 pixels by 6 percent. The incorporation of forward-backward consistency further improves the selection of optimal tracking paths, reducing drift and enhancing robustness. Notably, these improvements were achieved without compromising the model's real-time capabilities.

Conclusions: Using A-MFST and SAM2, we enhance SENDD's ability to track tissue in real time under instrument and tissue occlusions.
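
The quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the strict `<` comparison against each threshold, and the callable flow interface are all assumptions made for illustration. It shows Mean Endpoint Error, accuracy averaged over the 4, 8, 16, 32, and 64 pixel thresholds, and a basic forward-backward consistency error for a single point.

```python
import math

# Pixel thresholds listed in the abstract.
THRESHOLDS_PX = (4, 8, 16, 32, 64)


def endpoint_errors(pred, gt):
    """Per-point Euclidean distance (pixels) between predicted and true (x, y)."""
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]


def mean_endpoint_error(pred, gt):
    """Mean Endpoint Error (MEE): average of the per-point distances."""
    errs = endpoint_errors(pred, gt)
    return sum(errs) / len(errs)


def averaged_threshold_accuracy(pred, gt, thresholds=THRESHOLDS_PX):
    """Fraction of points within each threshold, averaged over all thresholds.

    Assumption: a point counts as correct when its error is strictly
    below the threshold.
    """
    errs = endpoint_errors(pred, gt)
    per_threshold = [sum(e < t for e in errs) / len(errs) for t in thresholds]
    return sum(per_threshold) / len(per_threshold)


def forward_backward_error(start, forward_flow, backward_flow):
    """Distance between a point and its position after a forward-backward round trip.

    forward_flow and backward_flow are hypothetical callables that return
    the (dx, dy) displacement at a given (x, y) location.
    """
    x, y = start
    fx, fy = forward_flow(x, y)        # displacement from frame A to frame B
    xb, yb = x + fx, y + fy
    bx, by = backward_flow(xb, yb)     # displacement from frame B back to frame A
    return math.hypot(xb + bx - x, yb + by - y)


if __name__ == "__main__":
    pred = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
    gt = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
    print("MEE:", mean_endpoint_error(pred, gt))
    print("Averaged accuracy:", averaged_threshold_accuracy(pred, gt))
```

A small forward-backward error suggests that the flow between two frames is self-consistent, which is the kind of signal a multi-flow tracker could use to prefer one candidate tracking path over another; how A-MFST weighs this signal is not specified in the abstract.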

Paper Structure

This paper contains 8 sections, 4 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Overall structure of the tracking algorithm: (a) Multi-Flow Sparse Tracker (MFST); (b) Adaptive Multi-Flow Sparse Tracker (A-MFST).
  • Figure 2: Illustration of the Initialization Process for SAM2: (a) Original Image of Frame 0; (b) Depth Map; (c) Thresholded Depth and Initialized Query Points; and (d) Mask Labeled Image.
  • Figure 3: Overall structure of the Multi-Flow Sparse Tracker (MFST).
  • Figure 4: Overall structure of the Adaptive Multi-Flow Sparse Tracker (A-MFST).
  • Figure 5: Mean Endpoint Error Over Clip Duration.
  • ...and 1 more figure