Dual Prompt-Driven Feature Encoding for Nighttime UAV Tracking

Yiheng Wang, Changhong Fu, Liangliang Yao, Haobo Zuo, Zijie Zhang

Abstract

Robust feature encoding is the foundation of UAV tracking: it enables nuanced perception of target appearance and motion and is therefore pivotal to reliable tracking. However, existing feature encoding methods often overlook illumination and viewpoint cues, which are essential for robust perception under challenging nighttime conditions, and tracking performance degrades as a result. To overcome this limitation, this work proposes a dual prompt-driven feature encoding method that integrates prompt-conditioned feature adaptation and context-aware prompt evolution to promote domain-invariant feature encoding. Specifically, the pyramid illumination prompter extracts multi-scale, frequency-aware illumination prompts. The dynamic viewpoint prompter modulates deformable convolution offsets to accommodate viewpoint variations, enabling the tracker to learn view-invariant features. Extensive experiments validate the effectiveness of the proposed dual prompt-driven tracker (DPTracker) in nighttime UAV tracking. Ablation studies highlight the contribution of each component in DPTracker. Real-world tests under diverse nighttime UAV tracking scenarios further demonstrate the robustness and practical utility of DPTracker. The code and demo videos are available at https://github.com/yiheng-wang-duke/DPTracker.
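To make the idea of a multi-scale, frequency-aware decomposition concrete, the following is a minimal sketch of a fixed Laplacian-style pyramid in plain Python. It is an illustrative stand-in, not the pyramid illumination prompter itself: the abstract describes a learnable pyramid network, whereas this sketch assumes fixed 2x2 average pooling and nearest-neighbour upsampling, and the function names (`downsample`, `upsample`, `build_pyramid`, `reconstruct`) are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block (assumes even height and width)."""
    h, w = len(img), len(img[0])
    return [[(img[2 * i][2 * j] + img[2 * i][2 * j + 1]
              + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w // 2)] for i in range(h // 2)]


def upsample(img):
    """Double the resolution by repeating every pixel in a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def build_pyramid(img, levels):
    """Split img into `levels` band-pass (detail) maps plus a low-frequency residual.

    Assumption: height and width are divisible by 2**levels.
    """
    bands = []
    cur = [[float(v) for v in row] for row in img]
    for _ in range(levels):
        low = downsample(cur)
        up = upsample(low)
        # Detail band = what the coarser level cannot represent at this scale.
        bands.append([[c - u for c, u in zip(c_row, u_row)]
                      for c_row, u_row in zip(cur, up)])
        cur = low
    return bands, cur


def reconstruct(bands, residual):
    """Invert build_pyramid by upsampling and adding the bands back, coarse to fine."""
    cur = residual
    for band in reversed(bands):
        up = upsample(cur)
        cur = [[u + b for u, b in zip(u_row, b_row)]
               for u_row, b_row in zip(up, band)]
    return cur


# Toy image: dark left half, bright right half.
image = [[10, 10, 200, 200] for _ in range(4)]
bands, residual = build_pyramid(image, levels=2)
restored = reconstruct(bands, residual)
```

In this toy image, the coarsest residual holds the global brightness level and the coarser detail band holds the left-to-right illumination contrast, which is the kind of scale separation a frequency-aware illumination cue could draw on.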

Paper Structure

This paper contains 15 sections, 10 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: The overall comparison between the SOTA method and the proposed DPTracker. With prompt-feature interaction, DPTracker learns adaptive features and outperforms SOTA trackers in nighttime UAV tracking.
  • Figure 2: Overall model architecture of DPTracker. The interaction between prompt tokens and features integrates prompt semantics into the features while updating the prompt tokens, thereby enhancing nighttime UAV tracking with more adaptive representations. The images are from NAT2021-test.
  • Figure 3: The architecture of the pyramid illumination prompter and the dynamic viewpoint prompter. The illumination prompter leverages a learnable pyramid network to decompose images and aggregate multi-scale features, while the dynamic viewpoint prompter combines standard and deformable convolutions to adaptively capture viewpoint information under UAV perspectives. The images are from DarkTrack2021.
  • Figure 4: Tracking result visualization of DPTracker-B along with other top trackers (OSTrack, DCPT, SAM-DA, UDAT, SiamBAN). The sequences are selected from UAVDark135. The proposed DPTracker-B shows more robust and precise tracking performance under diverse nighttime UAV tracking scenarios.
  • Figure 5: Attribute-based tracking performance evaluation. As shown in the left figure, DPTracker-T consistently achieves the best performance across all 8 attributes on NAT2021-test, outperforming 5 other lightweight trackers (SiamAPN, SiamAPN++, TCTrack, TCTrack++, AVTrack). In the right figure, DPTracker-B also demonstrates substantial performance improvements compared to other well-performing trackers (SiamBAN, UDAT, SAM-DA, DCPT, OSTrack).
  • ...and 1 more figures
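
The Figure 3 caption and the abstract describe a dynamic viewpoint prompter that modulates deformable convolution offsets. The sketch below illustrates only the general mechanism, under loud assumptions: it is a single-channel, single-location 3x3 deformable sampling in plain Python, and the scalar `gate` that scales the offsets is a hypothetical stand-in for whatever prompt-derived signal the paper uses. The names `bilinear` and `deform_conv_at` are likewise illustrative and are not taken from the DPTracker code.

```python
import math


def bilinear(img, y, x):
    """Sample img at a real-valued (y, x); positions outside the image read as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    fy, fx = y - y0, x - x0
    val = 0.0
    for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
        for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
            if 0 <= yy < h and 0 <= xx < w:
                val += wy * wx * img[yy][xx]
    return val


def deform_conv_at(img, weights, offsets, gate, cy, cx):
    """Response of a 3x3 deformable convolution centred at (cy, cx).

    weights: 9 kernel weights in row-major order.
    offsets: 9 (dy, dx) displacements, one per kernel tap.
    gate:    hypothetical prompt-derived scalar that scales every offset;
             gate = 0 recovers a standard 3x3 convolution.
    """
    out = 0.0
    k = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            dy, dx = offsets[k]
            out += weights[k] * bilinear(img, cy + ky + gate * dy, cx + kx + gate * dx)
            k += 1
    return out


# Horizontal ramp: pixel value equals its column index.
ramp = [[float(x) for x in range(5)] for _ in range(5)]
mean_kernel = [1.0 / 9.0] * 9
shift_right = [(0.0, 0.5)] * 9  # every tap pushed half a pixel to the right

standard = deform_conv_at(ramp, mean_kernel, shift_right, gate=0.0, cy=2, cx=2)
shifted = deform_conv_at(ramp, mean_kernel, shift_right, gate=1.0, cy=2, cx=2)
```

With the gate at 0 the taps sit on the regular grid; raising the gate moves the sampling positions along the offsets, so the same kernel weights read a shifted neighbourhood. A full deformable convolution would predict per-location offsets and operate over many channels, which this sketch omits.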