Pinwheel-shaped Convolution and Scale-based Dynamic Loss for Infrared Small Target Detection
Jiangnan Yang, Shuangli Liu, Jingjun Wu, Xinyu Su, Nan Hai, Xueli Huang
TL;DR
Infrared small target detection (IRSTD) remains challenging because targets are tiny, have a low signal-to-noise ratio (SNR), and sit in cluttered backgrounds. The paper introduces two plug-in contributions: pinwheel-shaped convolution (PConv), which substantially enlarges the receptive field with minimal parameter overhead, and scale-based dynamic (SD) loss, which adapts the balance between scale and location losses to target size, with an SDB variant for bounding boxes and an SDM variant for masks. It also proposes a new large-scale dataset, SIRST-UAVB. Experiments on IRSTD-1K and SIRST-UAVB across multiple detectors and segmentation models show consistent gains in accuracy, robustness, and generalization; code is available at https://github.com/JN-Yang/PConv-SDloss-Data. The approach advances IRSTD by tailoring the convolutional module to the Gaussian spatial distribution of infrared targets and by stabilizing training through dynamic loss weighting, which is of practical value for real-world surveillance and guidance systems.
Abstract
In recent years, convolutional neural network (CNN)-based methods for infrared small target detection have achieved outstanding performance. However, these methods typically employ standard convolutions and neglect the spatial characteristics of the pixel distribution of infrared small targets. We therefore propose a novel pinwheel-shaped convolution (PConv) as a replacement for standard convolutions in the lower layers of the backbone network. PConv aligns better with the Gaussian spatial distribution of pixels in dim small targets, enhances feature extraction, significantly increases the receptive field, and adds only a minimal number of parameters. Additionally, while recent loss functions combine scale and location losses, they do not adequately account for how the sensitivity of these losses varies across target scales, which limits detection performance on dim small targets. To overcome this, we propose a scale-based dynamic (SD) loss that adjusts the influence of the scale and location losses according to target size, improving the network's ability to detect targets of varying scales. We also construct a new benchmark, SIRST-UAVB, which is the largest and most challenging dataset to date for real-shot single-frame infrared small target detection. Finally, by integrating PConv and SD loss into the latest small target detection algorithms, we achieve significant performance improvements on IRSTD-1K and on our SIRST-UAVB dataset, validating the effectiveness and generalizability of our approach. Code: https://github.com/JN-Yang/PConv-SDloss-Data
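
The idea of weighting scale and location losses by target size can be illustrated with a minimal sketch. The abstract does not give the SD loss formula, so everything below is an assumption made for illustration and is not the authors' implementation: the function names, the reference area `ref_area`, the range parameter `delta`, the linear weighting rule, the choice that smaller targets place more weight on the location term, and the use of 1 - IoU and a normalized center distance as the scale and location terms.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def center_distance_loss(pred, gt):
    """Squared center distance, normalized by the squared diagonal
    of the smallest box enclosing both inputs."""
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    gcx, gcy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    ex1, ey1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    ex2, ey2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return ((pcx - gcx) ** 2 + (pcy - gcy) ** 2) / diag2 if diag2 > 0 else 0.0


def dynamic_weights(gt, ref_area=1024.0, delta=0.5):
    """Hypothetical size-dependent weights (assumption, not the paper's rule).

    beta grows linearly with ground-truth area and saturates at delta,
    so small targets get a lower scale weight and a higher location weight.
    The two weights always sum to 2.
    """
    area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    beta = delta * min(area / ref_area, 1.0)
    w_scale = 1.0 - delta + beta
    w_loc = 1.0 + delta - beta
    return w_scale, w_loc


def sd_style_box_loss(pred, gt, ref_area=1024.0, delta=0.5):
    """Weighted sum of a scale term (1 - IoU) and a location term."""
    w_scale, w_loc = dynamic_weights(gt, ref_area, delta)
    scale_loss = 1.0 - iou(pred, gt)
    loc_loss = center_distance_loss(pred, gt)
    return w_scale * scale_loss + w_loc * loc_loss


if __name__ == "__main__":
    small_gt = (0.0, 0.0, 4.0, 4.0)     # 16-pixel target
    large_gt = (0.0, 0.0, 64.0, 64.0)   # area above ref_area
    print(dynamic_weights(small_gt))
    print(dynamic_weights(large_gt))
    print(sd_style_box_loss((1.0, 1.0, 5.0, 5.0), small_gt))
```

In this sketch a target at or above `ref_area` gets equal weights on both terms, while a 4x4 target shifts most of the weight to the location term. A mask variant could presumably apply the same weighting to mask-level scale and location terms, but its form is not specified here.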
