RRCANet: Recurrent Reusable-Convolution Attention Network for Infrared Small Target Detection

Yongxian Liu, Boyang Li, Ting Liu, Zaiping Lin, Wei An

TL;DR

This work tackles infrared small target detection by introducing RRCA-Net, a lightweight network that achieves high precision with minimal parameters. It leverages a recurrent reusable-convolution block (RuCB) to iteratively refine features without adding new kernels, and a dual interactive attention aggregation module (DIAAM) to fuse multi-scale information efficiently. The loss function, DpT-k, combines Dice, Poly, and Top-k terms to promote shape accuracy and focus on hard samples, improving convergence and reducing false alarms. Evaluations on NUAA-SIRST, IRSTD-1k, and DenseSIRST demonstrate competitive accuracy and robustness, with the added benefit of transferability as a plug-and-play component for other IRSTD methods.
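The summary above names the three ingredients of the DpT-k loss but not how they are weighted or combined. As a rough illustration only, the sketch below sums a soft Dice term, a Poly-1 style cross-entropy term, and a Top-k term that averages the hardest pixels. The function names, the equal weights, the `epsilon` coefficient, and the `k_percent` value are all assumptions made for this example, not the paper's formulation.

```python
import math

def pixel_bce(p, t, eps=1e-7):
    """Binary cross-entropy for one pixel, with clamping for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return -(t * math.log(p) + (1 - t) * math.log(1.0 - p))

def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss: rewards overlap between predicted and true target shape."""
    inter = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def poly1_loss(probs, targets, epsilon=1.0):
    """Poly-1 style term: cross-entropy plus epsilon * (1 - p_t), averaged."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p  # probability assigned to the true class
        total += pixel_bce(p, t) + epsilon * (1.0 - p_t)
    return total / len(probs)

def topk_loss(probs, targets, k_percent=10.0):
    """Average cross-entropy over the k percent hardest pixels only."""
    losses = sorted((pixel_bce(p, t) for p, t in zip(probs, targets)), reverse=True)
    k = max(1, int(len(losses) * k_percent / 100.0))
    return sum(losses[:k]) / k

def dpt_k_style_loss(probs, targets, weights=(1.0, 1.0, 1.0)):
    """Hypothetical combination: weighted sum of the three terms (weights assumed)."""
    w_dice, w_poly, w_topk = weights
    return (w_dice * dice_loss(probs, targets)
            + w_poly * poly1_loss(probs, targets)
            + w_topk * topk_loss(probs, targets))

# Flattened toy prediction: four pixels, the last two belong to a target.
targets = [0, 0, 1, 1]
print(dpt_k_style_loss([0.1, 0.2, 0.7, 0.9], targets))
```

In this sketch the Dice term looks at the mask as a whole, while the Top-k term ignores easy background pixels, which is one plausible reading of "shape accuracy" and "focus on hard samples".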

Abstract

Infrared small target detection is a challenging task because of the targets' unique characteristics (e.g., small, dim, shapeless and changeable). Recently published CNN-based methods have achieved promising performance with heavy feature extraction and fusion modules. To achieve efficient and effective detection, we propose a recurrent reusable-convolution attention network (RRCA-Net) for infrared small target detection. Specifically, RRCA-Net incorporates a reusable-convolution block (RuCB) in a recurrent manner without introducing extra parameters. Through the repeated iteration in RuCB, the high-level information of small targets in the deep layers can be well maintained and further refined. Then, a dual interactive attention aggregation module (DIAAM) is proposed to promote the mutual enhancement and fusion of the refined information. In this way, RRCA-Net can both achieve high-level feature refinement and enhance the correlation of contextual information between adjacent layers. Moreover, to achieve steady convergence, we design a target-characteristic-inspired loss function (DpT-k loss) by integrating physical and mathematical constraints. Experimental results on three benchmark datasets (NUAA-SIRST, IRSTD-1k, and DenseSIRST) demonstrate that RRCA-Net achieves performance comparable to state-of-the-art methods while maintaining a small number of parameters, and that it can act as a plug-and-play module that brings consistent performance improvements to several popular IRSTD methods.
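The claim that recurrence adds no parameters can be made concrete with a toy example. The sketch below applies one shared 3x3 kernel several times to a single-channel feature map, so the parameter count stays fixed however many iterations run. The class name `ReusableConvBlock`, the zero padding, and the ReLU activation are assumptions for illustration; the actual internal structure of RuCB is not described in this section.

```python
def conv3x3(feature_map, kernel):
    """Single-channel 3x3 convolution with zero padding (output keeps input size)."""
    h, w = len(feature_map), len(feature_map[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += feature_map[y][x] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out

def relu(feature_map):
    return [[max(0.0, v) for v in row] for row in feature_map]

class ReusableConvBlock:
    """Hypothetical stand-in for a reusable-convolution block: one kernel, many passes."""

    def __init__(self, kernel, iterations):
        self.kernel = kernel          # the only learnable weights in this sketch
        self.iterations = iterations  # number of recurrent passes (n)

    def num_parameters(self):
        return sum(len(row) for row in self.kernel)

    def forward(self, feature_map):
        x = feature_map
        for _ in range(self.iterations):
            # The same kernel is reused on every pass; no new weights are created.
            x = relu(conv3x3(x, self.kernel))
        return x

smooth = [[0.1, 0.1, 0.1], [0.1, 0.2, 0.1], [0.1, 0.1, 0.1]]
shallow = ReusableConvBlock(smooth, iterations=1)
deep = ReusableConvBlock(smooth, iterations=3)
print(shallow.num_parameters(), deep.num_parameters())  # same count for both
```

The effective depth grows with the iteration count while the weight budget does not, which is the trade-off the abstract describes.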

Paper Structure

This paper contains 35 sections, 15 equations, 12 figures, 8 tables.

Figures (12)

  • Figure 1: Comparison between our RRCANet and recently published methods in terms of IoU, FLOPs and number of parameters. A larger circle represents more parameters.
  • Figure 2: Representation of small targets in a CNN under recurrent iteration: (a) RRCA-Net without iteration; (b) RRCA-Net with n=2.
  • Figure 3: An illustration of the proposed RRCA-Net. Feature extraction: Images are first input into the RuCB of the encoder to extract multi-layer features. Feature aggregation and enhancement: Subsequently, in the decoder, the extracted features are upsampled and fused with a DIAAM. Repeated feature refinement: The multi-layer features are concatenated to achieve robust output refinement after several encoding and decoding iterations. Finally, an eight-connected neighborhood clustering algorithm clusters the segmentation map to locate the centroid of each target region.
  • Figure 4: Visual analysis of DpT-k loss and others. (a) Curves of loss. (b) Gradient curves of loss.
  • Figure 5: ROC performance of different methods on the (a) IRSTD-1k, (b) NUAA-SIRST and (c) DenseSIRST datasets. Our RRCA-Net achieves consistently superior performance under different thresholds.
  • ...and 7 more figures
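
The Figure 3 caption mentions an eight-connected neighborhood clustering step that turns the segmentation map into target centroids. A minimal sketch of that idea follows, assuming a binary mask given as nested lists and a breadth-first flood fill; the function name `target_centroids` is hypothetical, and the paper's own post-processing may differ in detail.

```python
from collections import deque

def target_centroids(mask):
    """Group foreground pixels by 8-connectivity and return (row, col) centroids."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    visited = [[False] * w for _ in range(h)]
    centroids = []
    for i in range(h):
        for j in range(w):
            if not mask[i][j] or visited[i][j]:
                continue
            # Flood-fill one connected region starting from this pixel.
            queue = deque([(i, j)])
            visited[i][j] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (dy or dx) and 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not visited[ny][nx]:
                            visited[ny][nx] = True
                            queue.append((ny, nx))
            cy = sum(p[0] for p in pixels) / len(pixels)
            cx = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((cy, cx))
    return centroids

# Two diagonal pixels count as one target under 8-connectivity;
# the isolated pixel in the corner is a second target.
mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
print(target_centroids(mask))
```

Under 4-connectivity the two diagonal pixels would be split into separate regions, which matters for targets only a few pixels wide.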