Prompt-Driven Temporal Domain Adaptation for Nighttime UAV Tracking

Changhong Fu, Yiheng Wang, Liangliang Yao, Guangze Zheng, Haobo Zuo, Jia Pan

TL;DR

This work proposes TDA, a prompt-driven temporal domain adaptation training framework that fully utilizes temporal contexts for challenging nighttime UAV tracking, and constructs a new benchmark for long-term nighttime UAV tracking.

Abstract

Domain adaptation (DA) has driven great progress in nighttime UAV tracking under low-illumination scenarios. However, previous DA training-based works fall short in narrowing the discrepancy of temporal contexts for UAV trackers. To address this issue, this work proposes TDA, a prompt-driven temporal domain adaptation training framework that fully utilizes temporal contexts for challenging nighttime UAV tracking. Specifically, the proposed framework aligns the distributions of temporal contexts from the daytime and nighttime domains by training the temporal feature generator against the discriminator. The temporal-consistent discriminator progressively extracts shared domain-specific features to generate coherent domain discrimination results over the time series. Additionally, to obtain high-quality training samples, a prompt-driven object miner is employed to precisely locate objects in unannotated nighttime videos. Moreover, a new benchmark for long-term nighttime UAV tracking is constructed. Exhaustive evaluations on both public and self-constructed nighttime benchmarks demonstrate the remarkable performance of the tracker trained in the TDA framework, i.e., TDA-Track. Real-world tests at nighttime also show its practicality. The code and demo videos are available at https://github.com/vision4robotics/TDA-Track.
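The adversarial alignment described in the abstract can be illustrated with a minimal sketch. It is not taken from the TDA-Track code: the binary cross-entropy objective, the label convention (daytime = 1, nighttime = 0), and the function names `bce`, `discriminator_loss`, and `generator_loss` are all assumptions made for illustration. Each input list is assumed to hold the discriminator's per-frame probability that a temporal context comes from the daytime domain.

```python
import math

DAY, NIGHT = 1.0, 0.0  # assumed label convention, not from the paper


def bce(prob: float, label: float) -> float:
    """Binary cross-entropy for one probability/label pair."""
    eps = 1e-12  # avoids log(0)
    return -(label * math.log(prob + eps)
             + (1.0 - label) * math.log(1.0 - prob + eps))


def discriminator_loss(day_probs, night_probs):
    """The discriminator tries to tell the two domains apart:
    daytime contexts should score near DAY, nighttime near NIGHT."""
    losses = [bce(p, DAY) for p in day_probs]
    losses += [bce(p, NIGHT) for p in night_probs]
    return sum(losses) / len(losses)


def generator_loss(night_probs):
    """The generator tries to fool the discriminator: its nighttime
    contexts should be scored as if they came from the daytime domain."""
    return sum(bce(p, DAY) for p in night_probs) / len(night_probs)


# Per-frame scores over a short time series (made-up numbers).
day_scores = [0.9, 0.8, 0.9]
night_scores = [0.2, 0.1, 0.2]
print(discriminator_loss(day_scores, night_scores))
print(generator_loss(night_scores))
```

In this sketch the two losses pull in opposite directions on the nighttime scores, which is the sense in which the generator is trained against the discriminator. Alternating the two updates would push the distributions of daytime and nighttime temporal contexts toward each other.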

Paper Structure (18 sections, 8 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Comparison with previous domain adaptation (DA) training framework for nighttime UAV tracking. The proposed temporal domain adaptation (TDA) training framework generates the temporal contexts among daytime and nighttime images, and then narrows the feature discrepancy of temporal contexts from different domains with the temporal-consistent discriminator. (Image frames are from GOT-10k [huang2019got] and NAT2021-train [ye2022unsupervised].)
  • Figure 2: Overview of the temporal day-to-night domain adaptation framework for nighttime UAV tracking. The temporal generator learns to generate temporal contexts that are more adaptive to the nighttime domain. The temporal-consistent discriminator is trained to classify features and temporal contexts into different domains based on progressively extracted domain-specific representations. Prompt-driven object mining locates valuable objects with text prompts and builds their smooth trajectories in the time series. (Image frames are from GOT-10k [huang2019got] and NAT2021-train [ye2022unsupervised].)
  • Figure 3: The structure of the temporal-consistent discriminator. $\mathbf{M}_i$ denotes the temporal contexts encoded from the first $i$ frames. The utilization of temporal contexts is marked with red dotted lines. Better representations oriented for daytime or nighttime attributes are progressively extracted, which enables more robust discrimination.
  • Figure 4: The first frames of typical scenes in NAT2024-1. The tracking objects are marked with green boxes. The dark environments pose a great challenge to nighttime UAV tracking.
  • Figure 5: Long-term nighttime UAV tracking performance of TDA-Track and lightweight trackers on NUT-L and NAT2024-1. TDA-Track ranks first in all three metrics with remarkable improvement.
  • ...and 2 more figures
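
The Figure 2 caption mentions building smooth object trajectories over time from per-frame detections. The sketch below shows one common way to do this, greedy IoU linking followed by a moving average; it is a stand-in technique, not the paper's object miner. The per-frame candidate boxes are assumed to come from some text-prompted detector, and the names `iou`, `link_trajectory`, and `smooth`, as well as the `min_iou` and `window` parameters, are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def link_trajectory(candidates: List[List[Box]], start: Box,
                    min_iou: float = 0.3) -> List[Box]:
    """Greedy linking: in each frame, take the candidate that overlaps
    the previous box most; if none reaches min_iou, repeat the previous box."""
    trajectory = [start]
    for boxes in candidates:
        prev = trajectory[-1]
        best = max(boxes, key=lambda b: iou(prev, b), default=None)
        if best is not None and iou(prev, best) >= min_iou:
            trajectory.append(best)
        else:
            trajectory.append(prev)
    return trajectory


def smooth(trajectory: List[Box], window: int = 3) -> List[Box]:
    """Centered moving average over box coordinates (shorter at the ends)."""
    half = window // 2
    out = []
    for i in range(len(trajectory)):
        chunk = trajectory[max(0, i - half):i + half + 1]
        out.append(tuple(sum(b[k] for b in chunk) / len(chunk)
                         for k in range(4)))
    return out


# Made-up detections: frame 2 has no candidates, frame 1 has a distractor.
frames = [
    [(1.0, 1.0, 11.0, 11.0), (50.0, 50.0, 60.0, 60.0)],
    [],
    [(2.0, 2.0, 12.0, 12.0)],
]
track = smooth(link_trajectory(frames, start=(0.0, 0.0, 10.0, 10.0)))
print(track)
```

Repeating the previous box when no candidate overlaps enough keeps the trajectory continuous through missed detections, at the cost of drifting if the object moves quickly during the gap.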