Pick of the Bunch: Detecting Infrared Small Targets Beyond Hit-Miss Trade-Offs via Selective Rank-Aware Attention

Yimian Dai, Peiwen Pan, Yulei Qian, Yuxuan Li, Xiang Li, Jian Yang, Huan Wang

TL;DR

This work proposes SeRankDet, a deep network that achieves high accuracy beyond the conventional hit-miss trade-off, by following the “Pick of the Bunch” principle, and sets new benchmarks in state-of-the-art performance across multiple public datasets.

Abstract

Infrared small target detection faces the inherent challenge of precisely localizing dim targets amidst complex background clutter. Traditional approaches struggle to balance detection precision and false alarm rates. To break this dilemma, we propose SeRankDet, a deep network that achieves high accuracy beyond the conventional hit-miss trade-off, by following the “Pick of the Bunch” principle. At its core lies our Selective Rank-Aware Attention (SeRank) module, employing a non-linear Top-K selection process that preserves the most salient responses, preventing target signal dilution while maintaining constant complexity. Furthermore, we replace the static concatenation typical in U-Net structures with our Large Selective Feature Fusion (LSFF) module, a dynamic fusion strategy that empowers SeRankDet with adaptive feature integration, enhancing its ability to discriminate true targets from false alarms. The network's discernment is further refined by our Dilated Difference Convolution (DDC) module, which merges differential convolution aimed at amplifying subtle target characteristics with dilated convolution to expand the receptive field, thereby substantially improving target-background separation. Despite its lightweight architecture, the proposed SeRankDet sets new benchmarks in state-of-the-art performance across multiple public datasets. The code is available at https://github.com/GrokCV/SeRankDet.
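
The abstract's point about Top-K selection preventing "target signal dilution" can be illustrated with a toy comparison. The sketch below is not the SeRank module itself; it only contrasts plain averaging with averaging over the K largest responses, using hypothetical helper names (`global_average`, `top_k_average`) and made-up response values.

```python
import heapq


def global_average(responses):
    """Average over every position; a few strong responses get diluted."""
    return sum(responses) / len(responses)


def top_k_average(responses, k):
    """Average over only the k largest responses (a non-linear selection)."""
    if k <= 0:
        raise ValueError("k must be positive")
    k = min(k, len(responses))
    top = heapq.nlargest(k, responses)
    return sum(top) / len(top)


# Hypothetical response map flattened to 100 positions:
# 98 background positions with weak responses, 2 target positions with strong ones.
responses = [0.1] * 98 + [0.9] * 2

avg = global_average(responses)       # dominated by the background
topk = top_k_average(responses, k=2)  # keeps only the most salient responses

print(f"global average: {avg:.3f}")
print(f"top-2 average:  {topk:.3f}")
```

With only two target positions out of a hundred, the global average stays close to the background level, while the Top-K statistic reflects the target responses. How SeRank uses the selected values inside its attention computation is described in the paper and the linked repository, not here.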

Paper Structure

This paper contains 25 sections, 12 equations, 9 figures, and 7 tables.

Figures (9)

  • Figure 1: The indispensable role of large background context in infrared small target detection. Merely relying on the cropped region surrounding the target is insufficient to distinguish genuine targets from visually similar false alarms, highlighting the necessity of incorporating extensive contextual information.
  • Figure 2: Schematic overview of the proposed SeRankDet network. SeRankDet incorporates three specially designed modules: DDC, SeRank, and LSFF. For simplicity, only three out of the five stages are depicted.
  • Figure 3: Illustration of the proposed DDC module: Synergizing convolutions to enhance the capture of intricate details while simultaneously suppressing background noise through its broad receptive field.
  • Figure 4: Illustration of the proposed SeRank module, which amplifies target features with Top-K operation. It preserves pivotal target features against complex backgrounds while maintaining constant computational complexity.
  • Figure 5: Architecture of the proposed LSFF module: advancing beyond static concatenation in U-Net with dynamic selective feature fusion for better true positive discrimination in the SeRankDet framework.
  • ...and 4 more figures
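
The DDC description (Figure 3 and the abstract) combines two ideas: a difference-style convolution that responds to local contrast, and dilation that widens the receptive field. The one-dimensional sketch below shows one common form of each, assuming a central-difference formulation and replicate padding at the borders. The function name `dilated_difference_conv1d` and all values are hypothetical, and the paper's actual DDC module may be defined differently.

```python
def dilated_difference_conv1d(signal, weights, dilation=1):
    """Toy 1D difference convolution with dilation.

    For each position i, computes
        y[i] = sum_j weights[j] * (x[i + (j - c) * dilation] - x[i])
    where c is the kernel center. Out-of-range indices are clamped to the
    nearest valid sample (replicate padding), which is an assumption here.
    """
    if len(weights) % 2 == 0:
        raise ValueError("kernel length must be odd")
    center = len(weights) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = i + (j - center) * dilation
            idx = min(max(idx, 0), n - 1)  # replicate padding
            acc += w * (signal[idx] - signal[i])
        out.append(acc)
    return out


# A flat background produces no response, because every difference is zero.
flat = dilated_difference_conv1d([1.0] * 9, [1.0, 1.0, 1.0], dilation=2)

# A single bright sample on a flat background produces a strong response at
# the sample itself and at the positions `dilation` steps away from it.
spike_signal = [1.0] * 4 + [3.0] + [1.0] * 4
spike = dilated_difference_conv1d(spike_signal, [1.0, 1.0, 1.0], dilation=2)

print(flat)
print(spike)
```

The difference term cancels uniform background intensity, and increasing `dilation` lets the same three-tap kernel compare samples that are farther apart without adding weights.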