DaDiff: Domain-aware Diffusion Model for Nighttime UAV Tracking

Haobo Zuo, Changhong Fu, Guangze Zheng, Liangliang Yao, Kunhan Lu, Jia Pan

TL;DR

This work proposes a progressive alignment paradigm, the domain-aware diffusion model (DaDiff), which aligns nighttime low-resolution (LR) object features with daytime features through progressive and stable generation. It also constructs NUT-LR, a nighttime UAV tracking benchmark for LR objects.

Abstract

Domain adaptation is a promising solution to the misalignment of day/night image features in nighttime UAV tracking. However, the one-step adaptation paradigm is inadequate for the prevalent difficulties posed by low-resolution (LR) objects viewed from UAVs at night, owing to their blurry edge contours and limited detail information. Moreover, one-step approaches struggle to perceive LR objects disturbed by nighttime noise. To address these challenges, this work proposes a novel progressive alignment paradigm, named domain-aware diffusion model (DaDiff), which aligns nighttime LR object features with daytime features through progressive and stable generation. The proposed DaDiff includes an alignment encoder that enhances the detail information of nighttime LR objects, a tracking-oriented layer designed to collaborate closely with tracking tasks, and a successive distribution discriminator that distinguishes the different feature distributions at each diffusion timestep in turn. Furthermore, an elaborate nighttime UAV tracking benchmark for LR objects, namely NUT-LR, is constructed, consisting of 100 annotated sequences. Exhaustive experiments demonstrate the robustness and feature alignment ability of the proposed DaDiff. The source code and video demo are available at https://github.com/vision4robotics/DaDiff.
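
To make the notion of a "diffusion timestep" concrete, the following is a minimal sketch of the standard DDPM forward (noising) process applied to a feature vector. It is an illustration only and not the DaDiff implementation: the abstract does not state the noise schedule, the number of timesteps, or the feature shapes, so the linear schedule, the value of 1000 steps, and all function names below are assumptions.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_features(features, alpha_bar_t, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in features]


rng = random.Random(0)                 # fixed seed for reproducibility
betas = linear_beta_schedule(1000)     # hypothetical number of timesteps
alpha_bars = cumulative_alphas(betas)
x0 = [0.5, -1.0, 2.0]                  # toy stand-in for a feature vector
x_t = noise_features(x0, alpha_bars[99], rng)  # features at timestep 100
```

Because `alpha_bars` decreases monotonically, the amount of original signal retained shrinks gradually across timesteps. A progressive paradigm can therefore compare or align distributions at many intermediate noise levels rather than in a single step.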

Paper Structure

This paper contains 21 sections, 9 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Comparison of the one-step adaptation paradigm and the proposed domain-aware diffusion model, i.e., DaDiff, for nighttime UAV tracking. The feature distributions are visualized through t-SNE [van2008visualizing]. Green and red indicate the daytime and nighttime image feature distributions, respectively. The scattergrams depict day/night feature distributions from different feature alignment methods. DaDiff successively and steadily narrows feature distribution discrepancy, thereby achieving superior tracking results, especially for low-resolution (LR) objects.
  • Figure 2: Overview of the proposed DaDiff. The domain-aware diffusion model with the alignment encoder is employed to narrow the feature distribution discrepancy successively, achieving feature alignment for nighttime UAV tracking. The tracking-oriented layer is developed to connect closely with the tracking tasks. The successive distribution discriminator is trained to gradually distinguish daytime features from nighttime features. Best viewed in color.
  • Figure 3: Detailed workflow of the tracking-oriented layer. With the powerful information integration ability of the Transformer [vaswani2017attention] and internal information exploration, the tracking-oriented layer integrates the effective domain-aware information of aligned LR object features, collaborating closely with the tracking tasks.
  • Figure 4: Visual comparison of confidence maps generated by the Baseline, the one-step adaptation paradigm, and the proposed DaDiff. Target objects are marked by green boxes. The Baseline and the one-step adaptation paradigm struggle to extract robust LR object features under the interference of adverse illumination conditions. DaDiff stably and controllably aligns the image features through day/night domain awareness and the successive alignment strategy.
  • Figure 5: Typical frames of selected sequences from NUT-LR. The green boxes mark the tracked objects, and the red dotted boxes show enlarged target areas for a clear view of the tracked LR objects. The bottom-right corner of each image displays the sequence name, and the top-right corner shows the frame number.
  • ...and 2 more figures