FGAA-FPN: Foreground-Guided Angle-Aware Feature Pyramid Network for Oriented Object Detection
Jialin Ma
TL;DR
FGAA-FPN is a foreground-guided, angle-aware feature pyramid network for oriented object detection in remote sensing imagery. It applies Foreground-Guided Feature Modulation (FGFM) at low pyramid levels to strengthen object regions and Angle-Aware Multi-Head Attention (AAMHA) at high levels to enforce orientation-consistent feature interaction, reaching state-of-the-art results on DOTA v1.0 (75.5% mAP) and DOTA v1.5 (68.3% mAP). Ablation studies confirm that FGFM and AAMHA provide complementary gains and that their hierarchical placement is crucial for performance and efficiency. The method also generalizes across detectors and improves robustness in cluttered scenes with diverse object orientations. Overall, the work highlights the value of explicit foreground priors and geometry-aware fusion in multi-scale feature representation for remote sensing detection.
Abstract
With the increasing availability of high-resolution remote sensing and aerial imagery, oriented object detection has become a key capability for geographic information updating, maritime surveillance, and disaster response. However, it remains challenging due to cluttered backgrounds, severe scale variation, and large orientation changes. Existing approaches largely improve performance through multi-scale feature fusion with feature pyramid networks or contextual modeling with attention, but they often lack explicit foreground modeling and do not leverage geometric orientation priors, which limits feature discriminability. To overcome these limitations, we propose FGAA-FPN, a Foreground-Guided Angle-Aware Feature Pyramid Network for oriented object detection. FGAA-FPN is built on a hierarchical functional decomposition that accounts for the distinct spatial resolution and semantic abstraction across pyramid levels, thereby strengthening multi-scale representations. Concretely, a Foreground-Guided Feature Modulation module learns foreground saliency under weak supervision to enhance object regions and suppress background interference in low-level features. In parallel, an Angle-Aware Multi-Head Attention module encodes relative orientation relationships to guide global interactions among high-level semantic features. Extensive experiments on DOTA v1.0 and DOTA v1.5 demonstrate that FGAA-FPN achieves state-of-the-art results, reaching 75.5% and 68.3% mAP, respectively.
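To make the two ideas above concrete, the following is a minimal sketch in plain Python and not the authors' implementation. The abstract does not give the exact formulations, so everything here is an assumption made for illustration: the residual gating form `1 + sigmoid(s)`, the relative-orientation bias `strength * cos(2 * (theta_i - theta_j))` (chosen because it is periodic in pi), the single attention head, and the function names `foreground_modulate` and `angle_aware_attention`.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def foreground_modulate(features, saliency_logits):
    """Scale a [C][H][W] feature map by a foreground saliency gate.

    Assumed form: out = f * (1 + sigmoid(s)), so background pixels are
    left nearly unchanged and foreground pixels are amplified up to 2x.
    """
    return [
        [
            [v * (1.0 + sigmoid(saliency_logits[i][j])) for j, v in enumerate(row)]
            for i, row in enumerate(channel)
        ]
        for channel in features
    ]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def angle_aware_attention(queries, keys, values, angles, strength=1.0):
    """Single-head attention with an additive relative-orientation bias.

    Assumed bias: strength * cos(2 * (theta_i - theta_j)). It is largest
    for parallel tokens and smallest for perpendicular ones.
    """
    d = len(queries[0])
    out = []
    for q_i, theta_i in zip(queries, angles):
        logits = []
        for k_j, theta_j in zip(keys, angles):
            dot = sum(a * b for a, b in zip(q_i, k_j)) / math.sqrt(d)
            bias = strength * math.cos(2.0 * (theta_i - theta_j))
            logits.append(dot + bias)
        weights = softmax(logits)
        # Weighted sum of value vectors for this query token.
        out.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(len(values[0]))]
        )
    return out
```

With `strength=0` the second function reduces to ordinary scaled dot-product attention, which makes it easy to see what the orientation term contributes: tokens with similar angles attend to each other more strongly, while perpendicular tokens are down-weighted.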
