LiM-YOLO: Less is More with Pyramid Level Shift and Normalized Auxiliary Branch for Ship Detection in Optical Remote Sensing Imagery
Seon-Hoon Kim, Hyeji Sim, Youeyun Jung, Ok-Chul Jung, Yerin Kim
TL;DR
LiM-YOLO addresses the detection of small, elongated ships in optical remote sensing imagery, tracing the difficulty to a scale mismatch between ship sizes and the standard YOLO detection head. It introduces a Pyramid Level Shift that moves detection to P2-P4, adding a high-resolution P2 head and pruning P5, coupled with a Group Normalized Auxiliary Branch (GN-CBLinear) that stabilizes training on micro-batches. Across four datasets (SODA-A, DOTA-v1.5, FAIR1M-v2.0, ShipRSImageNet-V1), LiM-YOLO achieves state-of-the-art accuracy with substantially fewer parameters and GFLOPs, supporting the case for domain-specific architectural alignment. The approach improves small-ship localization, reduces background noise, and offers practical efficiency for onboard maritime surveillance systems.
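To make the micro-batch point concrete, the sketch below compares plain group normalization with training-mode batch normalization on toy data. It is not the GN-CBLinear block, whose internals are not described in this summary. It only illustrates the general property that group normalization computes its statistics per sample, so one image's output does not depend on which other images share its batch. The function names and the toy activations are hypothetical, and affine scale/shift parameters are omitted.

```python
import math

def group_norm(sample, num_groups, eps=1e-5):
    """Normalize one sample's channels in groups (no affine parameters).

    `sample` is a list of channels; each channel is a flat list of activations.
    Statistics come from this sample alone, never from other batch members.
    """
    channels_per_group = len(sample) // num_groups
    out = []
    for g in range(num_groups):
        group = sample[g * channels_per_group:(g + 1) * channels_per_group]
        values = [v for channel in group for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * scale for v in channel] for channel in group])
    return out

def batch_norm(batch, eps=1e-5):
    """Normalize each channel with statistics pooled over the whole batch."""
    num_channels = len(batch[0])
    out = [[None] * num_channels for _ in batch]
    for c in range(num_channels):
        values = [v for sample in batch for v in sample[c]]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        for i, sample in enumerate(batch):
            out[i][c] = [(v - mean) * scale for v in sample[c]]
    return out

# Hypothetical toy activations: 4 channels, 2 values per channel.
image_a = [[1.0, 2.0], [3.0, 4.0], [0.5, 1.5], [2.5, 3.5]]
image_b = [[10.0, 12.0], [9.0, 11.0], [8.0, 7.0], [6.0, 5.0]]
image_c = [[-1.0, 0.0], [0.0, 1.0], [2.0, 2.5], [1.0, 1.5]]

# Group norm sees only image_a, so its result is fixed.
gn_a = group_norm(image_a, num_groups=2)

# Batch norm of image_a changes with its batch partner (batch size 2).
bn_with_b = batch_norm([image_a, image_b])[0]
bn_with_c = batch_norm([image_a, image_c])[0]
print(bn_with_b == bn_with_c)  # False
```

With a batch of only two images, the batch-norm statistics swing with whichever image is paired with `image_a`, which is the kind of volatility a per-sample normalization avoids.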
Abstract
Applying general-purpose object detectors to ship detection in satellite imagery presents significant challenges due to the extreme scale disparity and morphological anisotropy of maritime targets. Standard architectures utilizing stride-32 (P5) layers often fail to resolve narrow vessels, resulting in spatial feature dilution. In this work, we propose LiM-YOLO, a specialized detector designed to resolve these domain-specific conflicts. Based on a statistical analysis of ship scales, we introduce a Pyramid Level Shift Strategy that reconfigures the detection head to P2-P4. This shift ensures compliance with Nyquist sampling criteria for small objects while eliminating the computational redundancy of deep layers. To further enhance training stability on high-resolution inputs, we incorporate a Group Normalized Convolutional Block for Linear Projection (GN-CBLinear), which mitigates gradient volatility in micro-batch settings. Validated on SODA-A, DOTA-v1.5, FAIR1M-v2.0, and ShipRSImageNet-V1, LiM-YOLO demonstrates superior detection accuracy and efficiency compared to state-of-the-art models. The code is available at https://github.com/egshkim/LiM-YOLO.
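The stride argument behind the Pyramid Level Shift can be sketched with simple arithmetic. The snippet below assumes the usual convention that pyramid level Pk has stride 2^k (so P5 is stride 32, as stated above) and a baseline head on P3-P5, as in standard YOLO configurations. The threshold of two feature-map cells per object side is an assumed Nyquist-style rule of thumb, not necessarily the paper's exact criterion, and the 10-pixel ship beam is a hypothetical value chosen for illustration.

```python
def stride_of(level):
    """Stride of pyramid level P<level>, assuming stride = 2**level."""
    return 2 ** level

def cells_spanned(extent_px, level):
    """Number of feature-map cells covered by an object side of extent_px pixels."""
    return extent_px / stride_of(level)

def resolvable_levels(extent_px, levels, min_cells=2.0):
    """Levels at which the object side spans at least min_cells cells.

    min_cells=2.0 is an assumed rule of thumb (two samples across the
    narrowest dimension), not a value taken from the paper.
    """
    return [lvl for lvl in levels if cells_spanned(extent_px, lvl) >= min_cells]

# Hypothetical narrow side (beam) of a ship, in pixels.
beam_px = 10

baseline_head = [3, 4, 5]  # P3-P5, strides 8, 16, 32
shifted_head = [2, 3, 4]   # P2-P4, strides 4, 8, 16

print(resolvable_levels(beam_px, baseline_head))  # []
print(resolvable_levels(beam_px, shifted_head))   # [2]
```

Under these assumptions, a 10-pixel beam covers 1.25 cells at P3 and less at P4 and P5, so no level of the baseline head meets the two-cell threshold, whereas P2 covers it with 2.5 cells.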
