DATransNet: Dynamic Attention Transformer Network for Infrared Small Target Detection

Chen Hu; Yian Huang; Kexuan Li; Luping Zhang; Chang Long; Yiming Zhu; Tian Pu; Zhenming Peng

TL;DR

The paper proposes the Dynamic Attention Transformer Network (DATransNet), which aims to extract and preserve the detailed information vital for small targets. It also introduces a global feature extraction module (GFEM) that provides a comprehensive perspective, so that the network does not focus solely on details while neglecting global information.

Abstract

Infrared small target detection (ISTD) is widely used in civilian and military applications. However, ISTD encounters several challenges, including the tendency for small and dim targets to be obscured by complex backgrounds. To address this issue, we propose the Dynamic Attention Transformer Network (DATransNet), which aims to extract and preserve detailed information vital for small targets. DATransNet employs the Dynamic Attention Transformer (DATrans), simulating central difference convolutions (CDC) to extract gradient features. Furthermore, we propose a global feature extraction module (GFEM) that offers a comprehensive perspective to prevent the network from focusing solely on details while neglecting the global information. We compare the network with state-of-the-art (SOTA) approaches and demonstrate that our method performs effectively. Our source code is available at https://github.com/greekinRoma/DATransNet.

Paper Structure (16 sections, 10 equations, 7 figures, 4 tables)

Introduction
Methods
Dynamic Attention Transformer (DATrans)
Global Feature Extraction Module (GFEM)
Loss Function
Experiment and Analysis
Dataset and Evaluation Metrics
Experimental Environment and Parameter Settings
Ablation Study
Studies of Module-wise Performance Gain
Studies of Dilation Rate for DATrans
Studies of GFEM
Comparison with State-of-the-art (SOTA) Methods
Qualitative Results
Quantitative Results
...and 1 more section

Figures (7)

Figure 1: The overall structure of DATransNet. Red denotes the upsampling stage (UpStage), and blue denotes the downsampling stage (Stage).
Figure 2: The first column is the original image, and the following 8 columns are the results of applying edge convolutions in different directions to the original image.
Figure 3: $T^n$ is derived by applying edge detection convolutions with a dilation ratio of $n$ to the image $I$, followed by flattening the output.
Figure 4: The overall structure of DATrans.
Figure 5: The structure of the GFEM, which uses a Non-local Attention Module (Non-local) to capture global spatial features and a Squeeze-and-Excitation block (SE Block) to extract global channel information.
...and 2 more figures
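
The captions of Figures 2 and 3 describe directional edge convolutions applied with a dilation ratio and then flattened. The sketch below illustrates that idea in NumPy under stated assumptions: it uses a plain neighbour-minus-centre difference in 8 directions with zero padding, and the function name `directional_differences` is hypothetical. It is not the authors' DATrans implementation, whose kernels, padding, and normalisation may differ.

```python
import numpy as np

# Offsets (row, col) of the 8 neighbours around a centre pixel.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def directional_differences(image: np.ndarray, dilation: int = 1) -> np.ndarray:
    """Return an (8, H*W) array of neighbour-minus-centre responses.

    Assumptions (not taken from the paper): zero padding at the border and
    an unweighted difference between each dilated neighbour and the centre.
    """
    h, w = image.shape
    padded = np.pad(image.astype(np.float64), dilation, mode="constant")
    centre = padded[dilation:dilation + h, dilation:dilation + w]
    maps = []
    for dr, dc in DIRECTIONS:
        # Top-left corner of the shifted view for this direction.
        r0 = dilation + dr * dilation
        c0 = dilation + dc * dilation
        neighbour = padded[r0:r0 + h, c0:c0 + w]
        maps.append((neighbour - centre).ravel())  # flatten each response map
    return np.stack(maps)
```

Calling `directional_differences(I, dilation=n)` on a 2-D image `I` yields one flattened response per direction, loosely analogous to the $T^n$ of Figure 3. A bright isolated pixel produces positive responses at its neighbours and negative responses at its own location, which is why difference features of this kind emphasise small, point-like targets.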
