SpirDet: Towards Efficient, Accurate and Lightweight Infrared Small Target Detector
Qianchen Mao, Qiang Li, Bingshu Wang, Yongjun Zhang, Tao Dai, C. L. Philip Chen
TL;DR
SpirDet tackles infrared small target detection by focusing computation on sparse target locations through a Dual-branch Sparse Decoder and by employing a lightweight DO-RepEncoder with downsampling orthogonality to preserve small-object features at speed. The method achieves state-of-the-art performance across multiple datasets, including MIoU improvements of about 4.7 percentage points on IRSTD-1K and 2.1 on NUDT-SIRSTD, alongside up to ~7x faster inference. Its core ideas, sparse coarse localization followed by high-resolution sparse refinement and a reparameterized encoder, reduce memory and computation without sacrificing accuracy. These advances have practical implications for real-time infrared surveillance and rescue applications, and the authors plan to release code publicly.
Abstract
In recent years, deep learning methods for infrared small target detection have garnered substantial attention due to notable advancements. To improve detection of small targets, these methods commonly maintain a pathway that preserves high-resolution features of sparse and tiny targets. However, this pathway can result in redundant and expensive computation. To tackle this challenge, we propose SpirDet, a novel approach for efficient detection of infrared small targets. Specifically, to cope with the computational redundancy, we employ a new dual-branch sparse decoder to restore the feature map. First, the fast branch directly predicts a sparse map indicating potential small target locations (occupying only 0.5\% of the map's area). Second, the slow branch conducts fine-grained adjustments at the positions indicated by the sparse map. Additionally, we design a lightweight DO-RepEncoder based on reparameterization with Downsampling Orthogonality, which effectively reduces memory consumption and inference latency. Extensive experiments show that the proposed SpirDet significantly outperforms state-of-the-art models while achieving faster inference and using fewer parameters. For example, on the IRSTD-1K dataset, SpirDet improves $MIoU$ by 4.7 and achieves a $7\times$ $FPS$ acceleration compared to the previous state-of-the-art model. The code will be made publicly available.
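
The two-stage decoding idea described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the function names `sparse_decode` and `refine_patch`, the top-k selection rule, the 4x upsampling factor, and the sigmoid used as the "slow branch" are all assumptions chosen for illustration. Only the 0.5% keep ratio comes from the text. The point is that the expensive per-pixel work runs only on the few coarse cells flagged by the fast branch, while the rest of the high-resolution map is left as background.

```python
import heapq
import math


def refine_patch(patch):
    # Hypothetical stand-in for the slow branch: a per-pixel sigmoid.
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in patch]


def sparse_decode(coarse_scores, fine_features, keep_ratio=0.005, scale=4):
    """Refine only the high-resolution patches under the top-scoring coarse cells.

    coarse_scores: h x w list of fast-branch scores (assumed layout).
    fine_features: (h*scale) x (w*scale) list of high-resolution features.
    """
    h, w = len(coarse_scores), len(coarse_scores[0])
    # Number of coarse cells to keep (0.5% of the map by default, at least one).
    k = max(1, round(keep_ratio * h * w))
    cells = [(y, x) for y in range(h) for x in range(w)]
    selected = heapq.nlargest(k, cells, key=lambda c: coarse_scores[c[0]][c[1]])

    # Background stays zero; only the selected patches are computed.
    out = [[0.0] * (w * scale) for _ in range(h * scale)]
    for y, x in selected:
        y0, x0 = y * scale, x * scale
        patch = [row[x0:x0 + scale] for row in fine_features[y0:y0 + scale]]
        for dy, row in enumerate(refine_patch(patch)):
            out[y0 + dy][x0:x0 + scale] = row
    return out, selected


# Toy input: a 20x20 coarse map with two candidate targets, 80x80 fine features.
coarse = [[0.0] * 20 for _ in range(20)]
coarse[3][7] = 1.0
coarse[10][10] = 0.9
fine = [[1.0] * 80 for _ in range(80)]

refined, selected = sparse_decode(coarse, fine)
touched = sum(v != 0.0 for row in refined for v in row)
print(sorted(selected), touched, "of", 80 * 80, "pixels refined")
```

In this toy setting, two of the 400 coarse cells are selected, so the slow branch touches 32 of the 6400 high-resolution pixels. A real decoder would use learned convolutions and sparse operators in place of the Python loops.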
