Bridging the Scale Gap: Balanced Tiny and General Object Detection in Remote Sensing Imagery

Zhicheng Zhao; Yin Huang; Lingma Sun; Chenglong Li; Jin Tang

TL;DR

The paper tackles the scale gap in remote sensing object detection, where tiny objects co-occur with large structures and object density varies dramatically. It introduces ScaleBridge-Det, a large detection framework that combines a Routing-Enhanced Mixture Attention (REM) module for scale-adaptive feature fusion with a Density-Guided Dynamic Query (DGQ) mechanism for density-aware query allocation, integrated into a DETR-based decoder. The authors report state-of-the-art performance on AI-TOD-V2 and DTOD and strong cross-domain robustness on VisDrone, while maintaining balanced detection across extreme scales. These results suggest a scalable path toward robust, cross-domain tiny-to-large object detection in aerial imagery, with potential for lightweight and multi-modal extensions in practical deployments.

Abstract

Tiny object detection in remote sensing imagery has attracted significant research interest in recent years. Despite this progress, achieving balanced detection performance across diverse object scales remains a formidable challenge, particularly in scenes where dense tiny objects and large objects coexist. Although large foundation models have revolutionized general vision tasks, their application to tiny object detection remains unexplored because of the extreme scale variation and density distribution inherent to remote sensing imagery. To bridge this scale gap, we propose ScaleBridge-Det, to the best of our knowledge the first large detection framework designed for tiny objects, which achieves balanced performance across diverse scales through scale-adaptive expert routing and density-guided query allocation. Specifically, we introduce a Routing-Enhanced Mixture Attention (REM) module that dynamically selects and fuses scale-specific expert features via adaptive routing, addressing the tendency of standard mixture-of-experts (MoE) models to favor dominant scales. REM generates complementary and discriminative multi-scale representations suitable for both tiny and large objects. Furthermore, we present a Density-Guided Dynamic Query (DGQ) module that predicts object density and uses it to adapt both the positions and the number of queries, enabling efficient resource allocation for objects of varying scales. Together, these modules allow ScaleBridge-Det to optimize performance for dense tiny objects and general objects simultaneously, without trade-offs. Extensive experiments on benchmark and cross-domain datasets demonstrate that ScaleBridge-Det achieves state-of-the-art performance on AI-TOD-V2 and DTOD, while exhibiting superior cross-domain robustness on VisDrone.
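The abstract names two mechanisms, expert routing and density-guided query allocation, without giving their formulas. The sketch below illustrates the general ideas only and is not the authors' implementation. It swaps in plain top-k softmax gating for REM's routing and a simple "clamp the count, take the densest cells" rule for DGQ. All function names, parameters, and default values (`route_experts`, `allocate_queries`, `top_k`, `queries_per_object`, and so on) are hypothetical.

```python
import math


def route_experts(gate_logits, expert_outputs, top_k=2):
    """Fuse the top_k expert feature vectors with softmax gate weights.

    Assumption: generic top-k MoE gating, standing in for REM's routing.
    """
    # Rank experts by gate logit and keep the top_k indices.
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Softmax over the selected logits only (shifted for numerical stability).
    peak = max(gate_logits[i] for i in chosen)
    exps = [math.exp(gate_logits[i] - peak) for i in chosen]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the chosen experts' feature vectors.
    dim = len(expert_outputs[0])
    fused = [sum(w * expert_outputs[i][d] for w, i in zip(weights, chosen))
             for d in range(dim)]
    return chosen, weights, fused


def allocate_queries(density_map, min_queries=2, max_queries=3,
                     queries_per_object=2.0):
    """Pick a query count from the summed density, then place the queries
    at the densest grid cells.

    Assumption: density_map is a 2D list of per-cell predicted object counts.
    """
    cells = [(value, (r, c))
             for r, row in enumerate(density_map)
             for c, value in enumerate(row)]
    estimated_objects = sum(value for value, _ in cells)
    wanted = math.ceil(estimated_objects * queries_per_object)
    # Clamp to [min_queries, max_queries], never exceeding the cell count.
    num_queries = min(len(cells), max(min_queries, min(max_queries, wanted)))
    # Densest cells first; ties broken by (row, col) for determinism.
    ordered = sorted(cells, key=lambda item: (-item[0], item[1]))
    positions = [pos for _, pos in ordered[:num_queries]]
    return num_queries, positions
```

In this toy version a sparse density map falls back to `min_queries`, while a dense one is capped at `max_queries`, so the query budget follows the predicted object count rather than being fixed for every image.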

Paper Structure

Table of Contents

Figures (7)