EFLNet: Enhancing Feature Learning for Infrared Small Target Detection

Bo Yang, Xinyu Zhang, Jian Zhang, Jun Luo, Mingliang Zhou, Yangjun Pi

TL;DR

Infrared (IR) small target detection suffers from extreme background-foreground imbalance and from bounding-box regression that is highly sensitive to small IoU changes. EFLNet combines an adaptive threshold focal loss (ATFL), the normalized Gaussian Wasserstein distance (NWD), and a dynamic head to improve target feature learning across scales. The approach yields state-of-the-art precision, recall, and F1 on three public datasets and introduces bounding-box annotations for existing infrared datasets to support detection-based evaluation. Together, these components improve robustness to background clutter and make target localization in infrared imagery more reliable.

Abstract

Single-frame infrared small target detection is a challenging task for three reasons: the extreme imbalance between target and background, the extreme sensitivity of bounding box regression to infrared small targets, and the tendency of target information to be lost in high-level semantic layers. In this article, we propose an enhancing feature learning network (EFLNet) to address these problems. First, we observe that the extreme imbalance between the target and the background in infrared images makes the model pay more attention to background features than to target features. To address this problem, we propose a new adaptive threshold focal loss (ATFL) function that decouples the target and the background and uses an adaptive mechanism to adjust the loss weight, forcing the model to allocate more attention to target features. Second, we introduce the normalized Gaussian Wasserstein distance (NWD) to alleviate the convergence difficulty caused by the extreme sensitivity of bounding box regression to infrared small targets. Finally, we incorporate a dynamic head mechanism into the network to enable adaptive learning of the relative importance of each semantic layer. Experimental results demonstrate that our method achieves better detection performance on infrared small targets than state-of-the-art (SOTA) deep-learning-based methods. The source code and bounding-box-annotated datasets are available at https://github.com/YangBo0411/infrared-small-target.
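
The sensitivity problem that motivates NWD can be illustrated with a small numerical sketch. The code below is not taken from the EFLNet source. It assumes the commonly used NWD formulation, in which a box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4). The function names `iou` and `nwd` and the normalization constant `c=12.8` are illustrative assumptions, and the constant would normally be chosen per dataset.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two boxes.

    The squared 2-Wasserstein distance between the two box Gaussians
    reduces to a squared Euclidean distance on (cx, cy, w/2, h/2).
    `c` is an assumed, dataset-dependent normalization constant.
    """
    w2_sq = (
        (box_a[0] - box_b[0]) ** 2
        + (box_a[1] - box_b[1]) ** 2
        + ((box_a[2] - box_b[2]) / 2) ** 2
        + ((box_a[3] - box_b[3]) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


# Shift a tiny (4x4) and a normal (40x40) box by one pixel in x.
for size in (4, 40):
    gt = (10.0, 10.0, size, size)
    pred = (11.0, 10.0, size, size)
    print(size, round(iou(gt, pred), 3), round(nwd(gt, pred), 3))
```

Under these assumptions, a one-pixel shift drops the IoU of the 4x4 box to 0.6 while the 40x40 box stays above 0.95, whereas the NWD value is the same for both sizes because it depends only on the center offset and the size difference.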

Paper Structure (17 sections, 23 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: Overview of the proposed EFLNet, which consists of a backbone, FPN, PAN, and dynamic head, together with the NWD and ATFL loss functions.
  • Figure 2: The imbalance phenomenon between the target and the background.
  • Figure 3: Changes in loss for different values of $\gamma$. Samples with $p_t > 0.5$ are regarded as well classified.
  • Figure 4: Sensitivity analysis of IoU for tiny-scale and normal-scale objects. (a) Tiny-scale object. (b) Normal-scale object.
  • Figure 5: Structure of the dynamic head block. $\pi_L$ denotes scale-aware attention, $\pi_S$ spatial-aware attention, and $\pi_C$ task-aware attention.
  • ...and 3 more figures
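
The effect of $\gamma$ described in the Figure 3 caption can be sketched numerically. The exact form of ATFL is not given in this section, so the code below uses the standard focal loss, $-(1 - p_t)^{\gamma} \log p_t$, as a stand-in, on the assumption (suggested by the name) that ATFL builds on it. The function name `focal_loss` is hypothetical, and the class-balancing factor that often accompanies focal loss is omitted.

```python
import math


def focal_loss(p_t, gamma):
    """Standard focal loss for a single sample (a stand-in, not ATFL).

    p_t is the predicted probability of the true class, 0 < p_t <= 1.
    gamma = 0 reduces the expression to plain cross-entropy.
    """
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Compare a hard sample (p_t = 0.1), a borderline one (0.5),
# and a well-classified one (0.9) for several gamma values.
for gamma in (0.0, 1.0, 2.0, 5.0):
    losses = [focal_loss(p_t, gamma) for p_t in (0.1, 0.5, 0.9)]
    print(gamma, [round(loss, 4) for loss in losses])
```

As $\gamma$ grows, the loss of the well-classified sample shrinks much faster than that of the hard sample, which shifts the training signal toward hard examples.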