Infrared Small Target Detection with Scale and Location Sensitivity

Qiankun Liu, Rui Liu, Bolun Zheng, Hongkui Wang, Ying Fu

TL;DR

The paper proposes a novel Scale and Location Sensitive (SLS) loss that addresses the limitations of existing losses, together with MSHNet, a plain U-Net extended with a simple multi-scale head. With the SLS loss applied to each scale of its predictions, MSHNet outperforms existing state-of-the-art methods by a large margin.

Abstract

Recently, infrared small target detection (IRSTD) has been dominated by deep-learning-based methods. However, these methods mainly focus on designing complex model structures to extract discriminative features, leaving the loss functions for IRSTD under-explored. For example, the widely used Intersection over Union (IoU) and Dice losses lack sensitivity to the scales and locations of targets, which limits detection performance. In this paper, we focus on boosting detection performance with a more effective loss and a simpler model structure. Specifically, we first propose a novel Scale and Location Sensitive (SLS) loss to handle the limitations of existing losses: 1) for scale sensitivity, we compute a weight for the IoU loss based on target scales to help the detector distinguish targets with different scales; 2) for location sensitivity, we introduce a penalty term based on the center points of targets to help the detector localize targets more precisely. Then, we add a simple multi-scale head to the plain U-Net (MSHNet). By applying the SLS loss to each scale of the predictions, our MSHNet outperforms existing state-of-the-art methods by a large margin. In addition, the detection performance of existing detectors can be further improved when they are trained with our SLS loss, demonstrating its effectiveness and generalization. The code is available at https://github.com/ying-fu/MSHNet.
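
The insensitivity of IoU and Dice to target scale and location can be checked on toy masks. The following is a minimal sketch, not the authors' code: masks are represented as sets of pixel coordinates, and the helpers `rect`, `iou`, `dice`, `center`, and `center_distance` are hypothetical names introduced only for this illustration. It does not implement the SLS loss itself, whose exact weight and penalty formulas are not given in this summary.

```python
import math


def rect(x0, y0, w, h):
    """Set of pixel coordinates covered by an axis-aligned rectangle."""
    return {(x, y) for x in range(x0, x0 + w) for y in range(y0, y0 + h)}


def iou(pred, gt):
    """Intersection over Union of two pixel sets."""
    union = len(pred | gt)
    return len(pred & gt) / union if union else 1.0


def dice(pred, gt):
    """Dice coefficient of two pixel sets."""
    total = len(pred) + len(gt)
    return 2 * len(pred & gt) / total if total else 1.0


def center(mask):
    """Centroid (mean x, mean y) of a non-empty pixel set."""
    n = len(mask)
    return (sum(x for x, _ in mask) / n, sum(y for _, y in mask) / n)


def center_distance(pred, gt):
    """Euclidean distance between the centroids of two pixel sets."""
    (px, py), (gx, gy) = center(pred), center(gt)
    return math.hypot(px - gx, py - gy)


# Scale: the prediction covers half of the target in both cases,
# but the targets differ in size by a factor of 16.
gt_small, pred_small = rect(0, 0, 2, 2), rect(0, 0, 1, 2)   # 4-pixel target
gt_large, pred_large = rect(0, 0, 8, 8), rect(0, 0, 4, 8)   # 64-pixel target
print(iou(pred_small, gt_small), iou(pred_large, gt_large))    # identical IoU
print(dice(pred_small, gt_small), dice(pred_large, gt_large))  # identical Dice

# Location: a near miss and a far miss both have zero overlap,
# so IoU cannot tell them apart even though the center errors differ.
gt = rect(10, 10, 3, 3)
pred_near = rect(13, 10, 3, 3)   # directly adjacent to the target
pred_far = rect(50, 50, 3, 3)    # far away from the target
print(iou(pred_near, gt), iou(pred_far, gt))
print(center_distance(pred_near, gt), center_distance(pred_far, gt))
```

In both comparisons the overlap-based scores are equal, which is the gap a scale-dependent weight and a center-point penalty are meant to close.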

Paper Structure (24 sections, 10 equations, 6 figures, 6 tables)

This paper contains 24 sections, 10 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Detection performance (IoU), inference speed (images/second), and number of floating-point operations (circle area) for several deep-learning-based methods. MSHNet achieves a better balance among these three metrics than the other methods. Results are evaluated on IRSTD-1k [zhang2022isnet].
  • Figure 2: Top row: our SLS loss for the targets of different scales, where IoU loss (=0.4) and Dice loss (=0.57) have the same values for different cases. Bottom row: our SLS loss for the targets of different locations, where IoU loss (=0.3) and Dice loss (=0.43) have the same values for different cases.
  • Figure 3: Left: the value of weight $w$ in scale sensitive loss with respect to the number of predicted pixels and ground-truth pixels (i.e., predicted and ground-truth scales). Right: the normalized location sensitive loss with respect to the location error between the predicted center point and ground-truth center point. The range of [0, 100] pixels is shown for illustration.
  • Figure 4: Overview of the proposed MSHNet. Our MSHNet is implemented based on a plain U-Net without bells and whistles. Only a simple multi-scale head is introduced. For each scale, the feature map is fed into a dedicated head, producing a prediction with the same spatial shape as the feature map. Different scales of predictions are upsampled (if needed) and concatenated together to get the final prediction. In the training stage, our SLS loss is applied to each of these predictions since it is scale sensitive.
  • Figure 5: Visual comparison of detection results on several infrared images. Correctly detected targets, missed targets, and false alarms are framed by red, blue, and yellow boxes, respectively. A close-up view of the target is shown in image corners.
  • ...and 1 more figures
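
The fusion step described in the Figure 4 caption (per-scale predictions, upsampled where needed, concatenated, and merged into a final prediction) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the MSHNet implementation: nearest-neighbour upsampling, square maps with integer scale factors, and a per-pixel weighted sum (equivalent to a 1x1 convolution over the concatenated single-channel predictions) are all assumptions, and the names `upsample_nearest` and `fuse` are hypothetical.

```python
def upsample_nearest(pred, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    out = []
    for row in pred:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse(preds, weights, bias=0.0):
    """Bring every prediction to the finest resolution and mix them per pixel.

    preds   : list of 2-D lists, ordered from finest to coarsest scale
    weights : one mixing weight per scale (assumed; learned in a real model)
    """
    height, width = len(preds[0]), len(preds[0][0])
    stacked = []
    for p in preds:
        factor = height // len(p)  # assumes square maps and integer factors
        stacked.append(upsample_nearest(p, factor) if factor > 1 else p)
    # A per-pixel weighted sum over the stacked maps plays the role of a
    # 1x1 convolution over the concatenated predictions.
    return [
        [bias + sum(w * s[y][x] for w, s in zip(weights, stacked))
         for x in range(width)]
        for y in range(height)
    ]


# Three toy predictions at 4x4, 2x2, and 1x1 resolution.
fine = [[0.0] * 4 for _ in range(4)]
fine[1][1] = 1.0
mid = [[1.0, 0.0], [0.0, 0.0]]
coarse = [[1.0]]

fused = fuse([fine, mid, coarse], weights=[0.5, 0.3, 0.2])
for row in fused:
    print(row)
```

During training, the caption states that the SLS loss is applied to each per-scale prediction; in this sketch that would correspond to evaluating the loss on `fine`, `mid`, and `coarse` separately against correspondingly resized ground truth.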