Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels

Guozhang Liu, Ting Liu, Mengke Yuan, Tao Pang, Guangxing Yang, Hao Fu, Tao Wang, Tongkui Liao

TL;DR

This work tackles category-label noise in oriented remote sensing object detection (ORSOD) by introducing Dynamic Loss Decay (DLD), a training strategy guided by two-phase learning dynamics. A key insight is the End Point of Early-learning ($EL$), identified from the second derivative of accuracy curves, which signals when memorization of noisy labels begins; after $EL$, DLD down-weights the top-$K$ losses using a dynamic factor $\alpha=\exp\big(10/(\text{EC}-EL)\big)$ to suppress harmful gradient updates. The approach is validated on HRSC2016 and DOTA-v1.0/v2.0 with synthetic category label noise, showing improvements over baselines and compatibility with multiple ORSOD architectures; it also took 2nd place in the noisy-label fine-grained object detection track of the 2023 National Big Data and Computing Intelligence Challenge (NBDCIC). The results indicate that focusing loss contributions on likely correct labels during memorization reduces degradation from category noise and improves fine-grained remote sensing detection performance. $EL$ controls when the decay begins and $\alpha$ controls its strength, making DLD a practical, plug-and-play mechanism for robust ORSOD under label noise.
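As a rough illustration of locating $EL$ from an accuracy curve, the sketch below approximates the second derivative with a discrete second difference. The selection rule (taking the epoch where that difference is most negative, i.e. where the curve bends most sharply toward a plateau) and the function name `estimate_el` are assumptions made for illustration; the summary above only says that $EL$ is identified from the second derivative.

```python
import numpy as np


def estimate_el(acc_per_epoch):
    """Return a 1-based epoch index as a candidate early-learning end point.

    Assumption: the candidate is the epoch where the discrete second
    difference of the accuracy curve is most negative.
    """
    acc = np.asarray(acc_per_epoch, dtype=float)
    if acc.size < 3:
        raise ValueError("need at least three epochs of accuracy values")
    # Central second difference; entry i is centered on epoch i + 2 (1-based).
    second_diff = acc[2:] - 2.0 * acc[1:-1] + acc[:-2]
    return int(np.argmin(second_diff)) + 2


# Hypothetical accuracy curve: fast early gains, then a plateau.
acc_curve = [0.10, 0.30, 0.50, 0.70, 0.72, 0.73, 0.74]
print(estimate_el(acc_curve))
```

Real accuracy curves are noisy, so smoothing the curve before differencing would likely be needed in practice.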

Abstract

The ambiguous appearance, tiny scale, and fine-grained classes of objects in remote sensing imagery inevitably lead to noisy category labels in detection datasets. However, the effects and treatment of label noise are underexplored in modern oriented remote sensing object detectors. To address this issue, we propose a robust oriented remote sensing object detection method based on a dynamic loss decay (DLD) mechanism, inspired by the two-phase "early-learning" and "memorization" learning dynamics of deep neural networks on clean and noisy samples. Specifically, we first identify the end point of the early-learning phase, termed EL, after which the model begins to memorize false labels, significantly degrading detection accuracy. Second, guided by this training indicator, the per-sample losses are ranked in descending order, and we adaptively decay the top K largest losses (bad samples) in the following epochs, because these large losses are very likely to have been computed with wrong labels. Experimental results show that the method achieves excellent noise resistance on multiple public datasets, such as HRSC2016 and DOTA-v1.0/v2.0, with synthetic category label noise. Our solution also won 2nd place in the "fine-grained object detection based on sub-meter remote sensing imagery" track with noisy labels of the 2023 National Big Data and Computing Intelligence Challenge.
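The rank-and-decay step described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `decay_top_k_losses` and the multiplicative `scale` argument are hypothetical, and how the decay strength is scheduled over epochs is deliberately left outside the sketch.

```python
import numpy as np


def decay_top_k_losses(losses, k, scale):
    """Down-weight the k largest per-sample losses by a factor `scale`.

    Assumptions: `scale` lies in [0, 1] and is applied multiplicatively;
    the remaining losses are left unchanged.
    """
    losses = np.asarray(losses, dtype=float)
    if not 0.0 <= scale <= 1.0:
        raise ValueError("scale must lie in [0, 1]")
    k = max(0, min(int(k), losses.size))
    weights = np.ones_like(losses)
    if k > 0:
        # Indices of the k largest losses (descending order).
        top_idx = np.argsort(losses)[::-1][:k]
        weights[top_idx] = scale
    return losses * weights


# Hypothetical per-sample classification losses for one batch.
batch_losses = [0.2, 3.0, 0.5, 2.5]
print(decay_top_k_losses(batch_losses, k=2, scale=0.1))
```

In an actual detector the same weighting would be applied to the framework's loss tensor before reduction, so that gradients from the down-weighted samples shrink accordingly.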

Paper Structure (17 sections, 3 equations, 11 figures, 7 tables)

This paper contains 17 sections, 3 equations, 11 figures, 7 tables.

Figures (11)

  • Figure 1: The first row illustrates the difficulty of annotating fine-grained plane types with similar appearances in remote sensing images. The second row shows, on the left, an example image from the DOTA-v1.0 dataset with the correct reference ground-truth annotations; the red and blue oriented bounding boxes indicate "plane" and "helicopter". We train a baseline ORSOD model and baseline + DLD (ours) with 30% synthesized noisy category labels. Their detection results are visualized in the middle and on the right, respectively: the baseline result contains several misclassified instances, while the baseline with DLD generates more accurate results.
  • Figure 2: The dynamics of two measurements, mean average precision (mAP) and top-1 accuracy (ACC), obtained with Oriented R-CNN with an LSKNet-Tiny backbone. The experiments are conducted on the DOTA-v1.0 dataset contaminated with different levels of category noise (20%, 30%, and 40%). The mAP is calculated between the model output and the clean GT category labels. The ACC of the model output is computed against the noisy category labels. The star in (b)(d) represents the early-learning endpoint.
  • Figure 3: Overview of the DLD training process. The bottom part illustrates the structure of an oriented object detector; the upper part gives a conceptual illustration of DLD based on the early-learning and memorization stages.
  • Figure 4: More detailed curves of mAPC (mAP with respect to correct category labels) and mAPI (mAP with respect to incorrect category labels) on the training set. The Oriented R-CNN model is trained on labels with a 40% noise level. The mAPI curve stays below 2%, while mAPC improves continuously throughout training.
  • Figure 5: The ACC curves of an ORSOD model trained with four different strategies on labels with 20% noise. The four strategies are: training Oriented R-CNN without DLD, and with DLD where the loss decay begins at epoch EL-4 (8), EL (12), or EL+4 (16).
  • ...and 6 more figures