EFLNet: Enhancing Feature Learning for Infrared Small Target Detection

Bo Yang; Xinyu Zhang; Jian Zhang; Jun Luo; Mingliang Zhou; Yangjun Pi

TL;DR

IR small target detection suffers from extreme background-foreground imbalance and IoU-sensitive bounding-box regression. EFLNet integrates adaptive threshold focal loss (ATFL), normalized Gaussian Wasserstein distance (NWD), and a dynamic head to improve target feature learning across scales. The approach yields state-of-the-art precision, recall, and F1 on three public datasets and introduces bounding-box annotations for existing infrared datasets to support detection-based evaluation. This work enhances robustness to background clutter and enables more reliable target localization in infrared imagery with practical data annotations.

Abstract

Single-frame infrared small target detection is a challenging task for three reasons: the extreme imbalance between target and background, the extreme sensitivity of bounding box regression to infrared small targets, and the ease with which target information is lost in high-level semantic layers. In this article, we propose an enhancing feature learning network (EFLNet) to address these problems. First, we observe that the extreme imbalance between target and background in infrared images makes the model pay more attention to background features than to target features. To address this, we propose a new adaptive threshold focal loss (ATFL) function that decouples the target from the background and uses an adaptive mechanism to adjust the loss weight, forcing the model to allocate more attention to target features. Second, we introduce the normalized Gaussian Wasserstein distance (NWD) to alleviate the convergence difficulty caused by the extreme sensitivity of bounding box regression to infrared small targets. Finally, we incorporate a dynamic head mechanism into the network to enable adaptive learning of the relative importance of each semantic layer. Experimental results demonstrate that our method achieves better detection performance on infrared small targets than state-of-the-art (SOTA) deep-learning-based methods. The source code and bounding-box-annotated datasets are available at https://github.com/YangBo0411/infrared-small-target.
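
The IoU sensitivity that motivates NWD can be illustrated with a minimal sketch. The abstract does not give the NWD formula, so the code below assumes the commonly cited formulation in which each box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), and the similarity is exp(-W2 / C). The constant `c`, the box values, and the function names are illustrative assumptions, not taken from the paper or its repository.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ax2 = box_a[0] - box_a[2] / 2, box_a[0] + box_a[2] / 2
    ay1, ay2 = box_a[1] - box_a[3] / 2, box_a[1] + box_a[3] / 2
    bx1, bx2 = box_b[0] - box_b[2] / 2, box_b[0] + box_b[2] / 2
    by1, by2 = box_b[1] - box_b[3] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance (assumed formulation).

    `c` is a dataset-dependent normalizer; 12.8 is only a placeholder here.
    """
    # 2-Wasserstein distance between the two Gaussians reduces to a
    # Euclidean distance between (cx, cy, w/2, h/2) vectors.
    w2 = math.sqrt(
        (box_a[0] - box_b[0]) ** 2
        + (box_a[1] - box_b[1]) ** 2
        + ((box_a[2] - box_b[2]) / 2) ** 2
        + ((box_a[3] - box_b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


# Hypothetical boxes: the same 2-pixel shift applied to a tiny and a normal box.
tiny, tiny_shifted = (10, 10, 4, 4), (12, 10, 4, 4)
normal, normal_shifted = (100, 100, 40, 40), (102, 100, 40, 40)

for name, a, b in [("tiny", tiny, tiny_shifted), ("normal", normal, normal_shifted)]:
    print(f"{name}: IoU={iou(a, b):.3f}  NWD={nwd(a, b):.3f}")
```

Under these assumptions, the same small offset changes IoU far more for the tiny box than for the normal one, and IoU falls to zero once the boxes stop overlapping, whereas the NWD value still varies smoothly with the offset.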

Paper Structure (17 sections, 23 equations, 8 figures, 7 tables)

Introduction
Related work
Model-based method
Deep-learning based method
Segmentation-based methods
Detection-based methods
Methodology
Overall Architecture
Adaptive threshold focal loss
Normalized Gaussian Wasserstein distance
Dynamic head
Experiment
Dataset
Quantitative Results
Visual Results
...and 2 more sections

Figures (8)

Figure 1: Overview of the proposed EFLNet, which consists of a backbone, FPN, PAN, and dynamic head, together with the NWD and ATFL loss functions.
Figure 2: The imbalance phenomenon between the target and the background.
Figure 3: Changes in loss for different values of $\gamma$. Samples with $p_t > 0.5$ are regarded as well classified.
Figure 4: Sensitivity analysis of IoU for tiny- and normal-scale objects. (a) Tiny-scale object. (b) Normal-scale object.
Figure 5: Structure of the dynamic head block. $\pi_L$ denotes scale-aware attention, $\pi_S$ denotes spatial-aware attention, and $\pi_C$ denotes task-aware attention.
...and 3 more figures
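
The dependence of the loss on $\gamma$ described in the Figure 3 caption can be sketched numerically. This page does not state the ATFL formula, so the code below uses the standard focal loss, $-(1 - p_t)^{\gamma} \log(p_t)$, as an assumed stand-in; ATFL's adaptive thresholding is not reproduced, and the function name and sample values are illustrative only.

```python
import math


def focal_loss(p_t, gamma):
    """Standard focal loss for one sample (an assumed stand-in, not ATFL).

    p_t is the predicted probability of the true class; gamma = 0
    recovers plain cross-entropy.
    """
    if not 0.0 < p_t <= 1.0:
        raise ValueError("p_t must lie in (0, 1]")
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Compare a hard sample (p_t = 0.1) with a well-classified one (p_t = 0.9).
for gamma in (0.0, 0.5, 1.0, 2.0, 5.0):
    hard = focal_loss(0.1, gamma)
    easy = focal_loss(0.9, gamma)
    print(f"gamma={gamma}: hard={hard:.4f}  easy={easy:.6f}")
```

As $\gamma$ grows, the loss from well-classified samples ($p_t > 0.5$) shrinks much faster than the loss from hard samples, which is the down-weighting behavior that focal-style losses rely on.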
