LiteTracker: Leveraging Temporal Causality for Accurate Low-latency Tissue Tracking
Mert Asim Karaoglu, Wenbo Ji, Ahmed Abbas, Nassir Navab, Benjamin Busam, Alexander Ladikos
TL;DR
LiteTracker tackles real-time tissue tracking in endoscopy with a frame-by-frame, low-latency variant of long-term point tracking. It extends CoTracker3 with a training-free temporal memory buffer and Exponential Moving Average (EMA) flow initialization to enable efficient online tracking. It runs roughly 7× faster than its predecessor and 2× faster than the fastest existing baselines, while maintaining competitive accuracy on the STIR and SuPer datasets. Key ideas include caching expensive correlation features, masking proxies in attention, and initializing point locations in each new frame from the flow estimate $F_t=\alpha (P_{t-1}-P_{t-2})+(1-\alpha)F_{t-1}$ with $\alpha=0.8$, where $P_t$ denotes the tracked point locations at frame $t$. This initialization makes a single refinement pass ($L=1$) sufficient. The results indicate practical value for real-time surgical navigation and XR, and the code is released for reproducibility.
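The EMA flow update above can be illustrated with a minimal NumPy sketch. The function name `ema_flow_init` and the array shapes are hypothetical, and the step that forms the initial location as $P_{t-1}+F_t$ is an assumption about how the flow is applied, not something stated in the summary; the released code may differ.

```python
import numpy as np


def ema_flow_init(p_prev, p_prev2, f_prev, alpha=0.8):
    """Sketch of EMA flow initialization (hypothetical helper).

    p_prev, p_prev2: (N, 2) point locations at frames t-1 and t-2.
    f_prev: (N, 2) smoothed flow from the previous step, F_{t-1}.
    Returns (F_t, initial guess for the locations at frame t).
    """
    p_prev = np.asarray(p_prev, dtype=float)
    p_prev2 = np.asarray(p_prev2, dtype=float)
    f_prev = np.asarray(f_prev, dtype=float)

    # F_t = alpha * (P_{t-1} - P_{t-2}) + (1 - alpha) * F_{t-1}
    f_t = alpha * (p_prev - p_prev2) + (1.0 - alpha) * f_prev

    # Assumption: the initial location is the last location shifted by F_t.
    p_init = p_prev + f_t
    return f_t, p_init
```

With $\alpha=0.8$ the most recent displacement dominates, while the remaining 20% of the weight on $F_{t-1}$ damps frame-to-frame jitter in the initial guess.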
Abstract
Tissue tracking plays a critical role in various surgical navigation and extended reality (XR) applications. While current methods trained on large synthetic datasets achieve high tracking accuracy and generalize well to endoscopic scenes, their runtime performance fails to meet the low-latency requirements of real-time surgical applications. To address this limitation, we propose LiteTracker, a low-latency method for tissue tracking in endoscopic video streams. LiteTracker builds on a state-of-the-art long-term point tracking method and introduces a set of training-free runtime optimizations. These optimizations enable online, frame-by-frame tracking by leveraging a temporal memory buffer for efficient feature reuse and by using prior motion for accurate track initialization. LiteTracker demonstrates significant runtime improvements, running around 7× faster than its predecessor and 2× faster than the state of the art. Beyond its primary focus on efficiency, LiteTracker delivers high-accuracy tracking and occlusion prediction, performing competitively on both the STIR and SuPer datasets. We believe LiteTracker is an important step toward low-latency tissue tracking for real-time surgical applications in the operating room. Our code is publicly available at https://github.com/ImFusionGmbH/lite-tracker.
