MambaTrack: Exploiting Dual-Enhancement for Night UAV Tracking
Chunhui Zhang, Li Liu, Hao Wen, Xi Zhou, Yanfeng Wang
TL;DR
MambaTrack tackles night UAV tracking by introducing dual enhancements: a lightweight Mamba-based low-light enhancer (MLLE) and a Cross-modal Mamba Network (CMM) that fuses vision and language cues. The method delivers robust tracking under poor illumination with high efficiency, thanks to linear-complexity Mamba backbones and a lightweight language-augmented search. Key contributions include the MLLE module, the CMM network, and a new vision-language night UAV tracking task built on annotated language prompts. The method is validated on five challenging datasets, where it reaches state-of-the-art performance with substantial memory and speed gains. The practical impact is improved nighttime UAV tracking capability with lower resource demands, enabling more reliable real-time operation in dark environments.
Abstract
Night unmanned aerial vehicle (UAV) tracking is impeded by poor illumination: previous daylight-optimized methods perform suboptimally in low-light conditions, which limits the utility of UAV applications. To this end, we propose an efficient Mamba-based tracker that leverages dual enhancement techniques to boost night UAV tracking. The Mamba-based low-light enhancer, equipped with an illumination estimator and a damage restorer, achieves global image enhancement while preserving the details and structure of low-light images. Additionally, we propose a cross-modal Mamba network to achieve efficient interactive learning between the vision and language modalities. Extensive experiments show that our method achieves advanced performance with significantly improved computation and memory efficiency. For instance, our method is 2.8$\times$ faster than CiteTracker and reduces GPU memory usage by 50.2$\%$. Our code is available at \url{https://github.com/983632847/Awesome-Multimodal-Object-Tracking}.
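
To illustrate why a state-space backbone scales linearly with sequence length, the following is a minimal sketch of a generic diagonal linear state-space recurrence in plain Python. It is not the MambaTrack implementation: the function name `ssm_scan` and the fixed parameters `a`, `b`, `c` are assumptions made for illustration, and Mamba itself additionally makes its parameters input-dependent and uses a hardware-aware scan, neither of which is shown here.

```python
def ssm_scan(x, a, b, c):
    """Run a diagonal linear state-space recurrence over a 1-D sequence.

    Hypothetical illustration only (not the paper's code):
        h_t = a * h_{t-1} + b * x_t   (elementwise over the state channels)
        y_t = sum(c * h_t)
    Each step touches the N state channels once, so the total cost is
    O(L * N) for a sequence of length L, i.e. linear in L.
    """
    h = [0.0] * len(a)  # hidden state, one value per state channel
    ys = []
    for x_t in x:
        # Update every state channel from the previous state and the input.
        h = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, h, b)]
        # Read out a scalar output as a weighted sum of the state.
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return ys


# Usage: an impulse input decays geometrically through a single state channel.
outputs = ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[2.0])
```

Because the state has a fixed size, the work per token is constant, whereas self-attention compares every token with every other token and therefore grows quadratically with sequence length.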
