HCF-Net: Hierarchical Context Fusion Network for Infrared Small Object Detection
Shibiao Xu, ShuChen Zheng, Wenhao Xu, Rongtao Xu, Changwei Wang, Jiguang Zhang, Xiaoqiang Teng, Ao Li, Li Guo
TL;DR
HCF-Net tackles infrared small object detection by preserving multi-scale context through three modules: Parallelized Patch-Aware Attention (PPA) for multi-branch, patch-based feature extraction, Dimension-Aware Selective Integration (DASI) for adaptive skip-connection fusion, and Multi-Dilated Channel Refiner (MDCR) for multi-scale channel refinement. The task is framed as semantic segmentation and the network is trained from scratch, with an emphasis on preserving features and fusing them selectively to counter small-object loss and background clutter. Experiments on the SIRST dataset report state-of-the-art IoU and nIoU, and ablations confirm the contribution of each module. The method offers practical advances for infrared surveillance and related domains where tiny targets must be detected against challenging backgrounds.
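The two reported metrics differ in how they aggregate over images. As a minimal sketch, assuming the definitions commonly used in the SIRST literature (IoU pools intersection and union pixels over the whole test set, while nIoU averages the per-image IoU), they could be computed as follows. The function name `iou_and_niou` and the nested-list mask format are illustrative choices, not taken from the HCF-Net code.

```python
def iou_and_niou(preds, targets):
    """Compute dataset-level IoU and per-image-averaged nIoU.

    preds, targets: lists of equally sized binary masks, each mask a
    nested list of 0/1 values (hypothetical format for illustration).
    """
    total_inter = 0
    total_union = 0
    per_image = []
    for pred, target in zip(preds, targets):
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                inter += 1 if (p and t) else 0
                union += 1 if (p or t) else 0
        total_inter += inter
        total_union += union
        # Assumption: an image with no predicted and no true pixels
        # counts as a perfect match.
        per_image.append(inter / union if union else 1.0)
    iou = total_inter / total_union if total_union else 1.0
    niou = sum(per_image) / len(per_image)
    return iou, niou


# Two tiny 2x2 masks: the first prediction has one false-positive pixel,
# the second matches its target exactly.
preds = [[[1, 1], [0, 0]], [[0, 0], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
iou, niou = iou_and_niou(preds, targets)
```

Because nIoU weights every image equally, a miss on a single tiny target lowers it more than it lowers the pooled IoU, which is why both are usually reported together for small-object benchmarks.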
Abstract
Infrared small object detection is an important computer vision task involving the recognition and localization of tiny objects, often only a few pixels in size, in infrared images. However, it encounters difficulties due to the diminutive size of the objects and the generally complex backgrounds in infrared images. In this paper, we propose a deep learning method, HCF-Net, that significantly improves infrared small object detection performance through multiple practical modules. Specifically, it includes the parallelized patch-aware attention (PPA) module, dimension-aware selective integration (DASI) module, and multi-dilated channel refiner (MDCR) module. The PPA module uses a multi-branch feature extraction strategy to capture feature information at different scales and levels. The DASI module enables adaptive channel selection and fusion. The MDCR module captures spatial features across different receptive field ranges through multiple depthwise separable convolutional layers. Extensive experimental results on the SIRST infrared single-frame image dataset show that the proposed HCF-Net outperforms other traditional and deep learning models. Code is available at https://github.com/zhengshuchen/HCFNet.
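To illustrate why dilated depthwise separable convolutions are a cheap way to cover several receptive field ranges, the following sketch computes the effective kernel size of a dilated convolution and compares weight counts against a standard convolution. The dilation rates, channel count, and function names are hypothetical and chosen only for illustration; they are not the configuration used in MDCR, and bias terms are ignored.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def depthwise_separable_params(in_ch, out_ch, kernel_size):
    """Weights in a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = in_ch * kernel_size * kernel_size  # one k x k filter per channel
    pointwise = in_ch * out_ch                     # 1x1 conv mixing channels
    return depthwise + pointwise


def standard_conv_params(in_ch, out_ch, kernel_size):
    """Weights in an ordinary k x k convolution."""
    return in_ch * out_ch * kernel_size * kernel_size


# Hypothetical dilation rates for four parallel 3x3 branches.
rates = (1, 2, 4, 8)
extents = [effective_kernel_size(3, d) for d in rates]

# Hypothetical 64-channel layer.
separable = depthwise_separable_params(64, 64, 3)
standard = standard_conv_params(64, 64, 3)
```

Dilation enlarges the area each branch sees without adding weights, so the parallel branches differ only in receptive field, while the depthwise separable factorization keeps each branch far lighter than a standard convolution of the same kernel size.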
