PRNet: Original Information Is All You Have
PeiHuang Zheng, Yunlong Zhao, Zheng Cui, Yang Li
TL;DR
PRNet tackles information loss in small-object detection for aerial imagery by preserving shallow spatial features throughout the network. It introduces the Progressive Refinement Neck (PRN) to iteratively refine high-resolution features via backbone feature reuse and progressive fusion, and Enhanced SliceSamp (ESSamp) to mitigate detail degradation during downsampling using PixelUnShuffle and augmented depthwise convolution with a depth multiplier. Across VisDrone, AI-TOD, and UAVDT, PRNet achieves superior accuracy under realistic computational budgets and shows robust gains in ablations, demonstrating both improved detail preservation and effective multi-scale fusion. The framework offers a practical, real-time solution for precise aerial object detection with strong generalization across detectors and model scales.
Abstract
Small object detection in aerial images suffers from severe information degradation during feature extraction due to limited pixel representations, where shallow spatial details fail to align effectively with semantic information, leading to frequent misses and false positives. Existing FPN-based methods attempt to mitigate these losses through post-processing enhancements, but the reconstructed details often deviate from the original image information, impeding their fusion with semantic content. To address this limitation, we propose PRNet, a real-time detection framework that prioritizes the preservation and efficient utilization of primitive shallow spatial features to enhance small object representations. PRNet achieves this via two modules: the Progressive Refinement Neck (PRN) for spatial-semantic alignment through backbone reuse and iterative refinement, and the Enhanced SliceSamp (ESSamp) for preserving shallow information during downsampling via optimized rearrangement and convolution. Extensive experiments on the VisDrone, AI-TOD, and UAVDT datasets demonstrate that PRNet outperforms state-of-the-art methods under comparable computational constraints, achieving superior accuracy-efficiency trade-offs.
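To make the "rearrangement" idea behind ESSamp concrete, the following is a minimal sketch of a pixel-unshuffle (space-to-depth) step on a single-channel grid, written in plain Python. It is an illustration only, not the authors' implementation: the function names `pixel_unshuffle` and `strided_subsample`, the toy 4x4 input, and the factor `r=2` are assumptions made here, and the depthwise convolution with a depth multiplier that ESSamp applies afterwards is omitted. The point it shows is that rearrangement halves the spatial resolution while keeping every original pixel value, whereas plain strided subsampling discards most of them.

```python
def pixel_unshuffle(image, r=2):
    """Rearrange an H x W grid into r*r sub-grids of size (H/r) x (W/r).

    Sub-grid k = i * r + j holds the pixels at rows i, i+r, ... and
    columns j, j+r, ..., so no pixel value is dropped.
    """
    h, w = len(image), len(image[0])
    if h % r or w % r:
        raise ValueError("height and width must be divisible by r")
    return [
        [[image[y * r + i][x * r + j] for x in range(w // r)]
         for y in range(h // r)]
        for i in range(r)
        for j in range(r)
    ]


def strided_subsample(image, r=2):
    """Baseline for comparison: keep one pixel per r x r block."""
    return [row[::r] for row in image[::r]]


# Toy 4x4 "image" whose pixel values are 0..15 (hypothetical data).
image = [[4 * y + x for x in range(4)] for y in range(4)]

channels = pixel_unshuffle(image, r=2)

# Collect the pixel values that survive each downsampling scheme.
kept_unshuffle = sorted(v for ch in channels for row in ch for v in row)
kept_strided = sorted(v for row in strided_subsample(image) for v in row)

print(len(channels), "sub-grids of size",
      len(channels[0]), "x", len(channels[0][0]))
print("values kept by pixel unshuffle:", len(kept_unshuffle))
print("values kept by strided subsampling:", len(kept_strided))
```

In this sketch the unshuffled output keeps all 16 input values spread over four 2x2 sub-grids, while the strided baseline keeps only 4. In a real network the sub-grids would become channels that a subsequent convolution can mix, which is where a learned layer such as the depthwise convolution described in the paper would come in.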
