Hybrid Mask Generation for Infrared Small Target Detection with Single-Point Supervision
Weijie He, Mushui Liu, Yunlong Yu
TL;DR
This work addresses infrared small target detection (IRSTD) under single-point supervision with Hybrid Mask Generation (HMG), which pairs a learning-free, size-adaptive Point-to-Mask Generation (PMG) step with a learning-based pseudo mask updating framework. PMG first converts each point annotation into a bounding box (Point-to-Box) and then into an initial pseudo mask (Box-to-Mask) using region-based thresholds and directional probability maps. Pseudo Mask Updating then filters false alarms (FAF) and retrieves missed detections (MDR) in the network's predictions, and merges the result with the initial masks to form hybrid supervision. Across three SIRST datasets, HMG improves IoU over prior single-point methods and approaches fully supervised performance in some configurations, while keeping false-alarm rates low and remaining robust to real-world perturbations. Because it needs only point labels, the method reduces annotation cost and offers a practical route to IRSTD. The key contributions are the size-aware PMG, the FAF/MDR updating scheme, and ablations that test robustness to hyperparameters and cropping choices.
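The updating step described above can be illustrated with a minimal sketch. It is not the authors' FAF/MDR implementation. It assumes a simplified rule: a predicted connected component is kept only if it contains an annotated point, and a point with no predicted component falls back to the learning-free mask. Masks are represented as sets of (row, col) pixels, and the names `component` and `update_pseudo_mask` are hypothetical.

```python
def component(mask, seed):
    """Return the 4-connected component of pixel set `mask` that contains `seed`.

    Returns an empty set if `seed` is not a foreground pixel of `mask`.
    """
    if seed not in mask:
        return set()
    seen, stack = {seed}, [seed]
    while stack:
        r, c = stack.pop()
        for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if nb in mask and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen


def update_pseudo_mask(pred, init_mask, points):
    """Hypothetical hybrid update (an assumption, not the paper's exact rule).

    pred:      set of pixels predicted as target by the network
    init_mask: set of pixels in the learning-free pseudo mask
    points:    list of annotated (row, col) points, one per target
    """
    hybrid = set()
    for p in points:
        from_pred = component(pred, p)
        if from_pred:
            # Prediction is anchored by an annotation: keep it.
            # Components with no annotated point are never added,
            # which drops them as false alarms.
            hybrid |= from_pred
        else:
            # The network missed this target: retrieve it from the
            # learning-free mask instead.
            hybrid |= component(init_mask, p)
    return hybrid
```

For example, with `pred = {(1, 1), (1, 2), (4, 4)}`, `init_mask = {(1, 1), (4, 1), (5, 1)}`, and points at `(1, 1)` and `(4, 1)`, the unanchored pixel `(4, 4)` is discarded and the missed target at `(4, 1)` is recovered from the initial mask.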
Abstract
Single-frame infrared small target (SIRST) detection is challenging because minute targets must be discerned amid complex infrared background clutter. In this paper, we focus on a weakly supervised paradigm that obtains high-quality pseudo masks from point-level annotations by combining a novel learning-free method with a learning-based method in a hybrid scheme. The learning-free method proceeds sequentially: from a point annotation, to a bounding box that encloses the target, and then to a detailed pseudo mask. The hybrid is achieved by filtering out false alarms and retrieving missed detections in the network's predictions, which provides a reliable supplement to the learning-free masks. Experimental results show that our learning-free method generates pseudo masks whose average Intersection over Union (IoU) is 4.3% higher than that of the second-best learning-free competitor across three datasets, and that the hybrid learning-based method further improves pseudo mask quality, yielding an additional average IoU increase of 3.4%.
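Pseudo mask quality above is reported as average IoU. As a reminder of how that metric is computed, here is a minimal sketch with masks given as sets of (row, col) pixels. The function names are hypothetical. Scoring two empty masks as 1.0 and averaging per image are assumptions, and the paper's evaluation protocol may aggregate differently.

```python
def iou(pseudo_mask, gt_mask):
    """Intersection over Union between two pixel sets."""
    union = pseudo_mask | gt_mask
    if not union:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return len(pseudo_mask & gt_mask) / len(union)


def mean_iou(pairs):
    """Average IoU over an iterable of (pseudo_mask, gt_mask) pairs."""
    scores = [iou(pseudo, gt) for pseudo, gt in pairs]
    return sum(scores) / len(scores)
```

Two 2x2 masks that overlap in a single column share 2 pixels out of a union of 6, giving an IoU of 1/3.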
