Guidance Disentanglement Network for Optics-Guided Thermal UAV Image Super-Resolution

Zhicheng Zhao, Juanjuan Gu, Chenglong Li, Chun Wang, Zhongling Huang, Jin Tang

TL;DR

A novel Guidance Disentanglement network (GDNet) is proposed, which disentangles the optical image representation according to typical UAV scenario attributes to form guidance features under both favorable and adverse conditions, for robust OTUAV-SR.

Abstract

Optics-guided Thermal UAV image Super-Resolution (OTUAV-SR) has attracted significant research interest due to its potential applications in security inspection, agricultural measurement, and object detection. Existing methods often employ a single guidance model to generate guidance features from optical images to assist thermal UAV image super-resolution. However, a single guidance model struggles to generate effective guidance features under both favorable and adverse conditions in UAV scenarios, which limits the performance of OTUAV-SR. To address this issue, we propose a novel Guidance Disentanglement network (GDNet), which disentangles the optical image representation according to typical UAV scenario attributes to form guidance features under both favorable and adverse conditions, for robust OTUAV-SR. Moreover, we design an attribute-aware fusion module that combines all attribute-based optical guidance features, forming a more discriminative representation that fits the attribute-agnostic guidance process. To facilitate OTUAV-SR research in complex UAV scenarios, we introduce VGTSR2.0, a large-scale benchmark dataset containing 3,500 aligned optical-thermal image pairs captured under diverse conditions and scenes. Extensive experiments on VGTSR2.0 demonstrate that GDNet significantly improves OTUAV-SR performance over state-of-the-art methods, especially in the challenging low-light and foggy environments commonly encountered in UAV scenarios. The dataset and code will be publicly available at https://github.com/Jocelyney/GDNet.

Paper Structure

This paper contains 23 sections, 14 equations, 19 figures, 8 tables.

Figures (19)

  • Figure 1: Uncertainty of the optical images in foggy and low-light conditions. We compare the performance of the baseline SwinIR, single image super-resolution methods such as Restormer and HAT, as well as existing guided super-resolution methods, including UGSR, CENet, and MGNet, under low-light and foggy conditions. Our method effectively integrates information from both modalities, leading to superior results.
  • Figure 2: The framework of the proposed guidance disentanglement SR method consists of three stages: Stage 1 involves data acquisition and degradation; Stage 2 focuses on training GDNet using the degraded data; and Stage 3 tests the model by inputting data and evaluating the SR results.
  • Figure 3: The overall structure of the proposed GDNet. The Attribute-specific Guidance Module (AGM) consists of three branches: NC, LI, and FO, which represent attribute-specific branches for normal conditions, low illumination, and fog obstruction, respectively. To aggregate features from these branches, we introduce the Attribute-aware Fusion Module (AFM), which facilitates the adaptive aggregation of the outputs. Additionally, the MOGM enhances feature representation by integrating both optical and thermal information.
  • Figure 4: The network structures of Backbone and MGL. Optical images are input into NC and FO, where the Backbone first performs shallow feature extraction and size reduction, followed by the MGL module.
  • Figure 5: The network structure of GAL. We use a gating mechanism to filter fog-related features from foggy optical images and interact with fog-resistant thermal images to generate more discriminative optical features.
  • ...and 14 more figures
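
The caption of Figure 3 describes three attribute-specific branches (NC, LI, FO) whose outputs are adaptively aggregated by the AFM. The excerpt does not say how that aggregation is computed, so the following is only a minimal sketch under a stated assumption: each branch receives a scalar relevance score, the scores are normalised with a softmax, and the branch features are combined as a weighted sum. The function names, the score inputs, and the softmax weighting are hypothetical illustrations, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attribute_features(branch_features, branch_scores):
    """Combine per-branch feature vectors with softmax weights.

    branch_features: dict, branch name -> list of floats (equal lengths).
    branch_scores:   dict, branch name -> unnormalised relevance score.
    Both inputs are hypothetical stand-ins for the real feature maps;
    the softmax-weighted sum is an assumption, not the paper's AFM.
    """
    names = sorted(branch_features)
    weights = softmax([branch_scores[name] for name in names])
    dim = len(branch_features[names[0]])
    fused = [0.0] * dim
    for name, weight in zip(names, weights):
        features = branch_features[name]
        if len(features) != dim:
            raise ValueError("all branch features must share one length")
        for i, value in enumerate(features):
            fused[i] += weight * value
    return fused, dict(zip(names, weights))


# Toy usage: three branches with two-dimensional features.
toy_features = {"NC": [1.0, 0.0], "LI": [0.0, 1.0], "FO": [1.0, 1.0]}
toy_scores = {"NC": 0.0, "LI": 0.0, "FO": 0.0}
fused, weights = fuse_attribute_features(toy_features, toy_scores)
```

With equal scores the sketch reduces to a plain average of the branches. Raising one branch's score shifts the fused feature toward that branch, which is the behaviour an attribute-agnostic fusion step would need when the scene condition is not known in advance.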