AIO2: Online Correction of Object Labels for Deep Learning with Incomplete Annotation in Remote Sensing Image Segmentation

Chenying Liu, Conrad M Albrecht, Yi Wang, Qingyu Li, Xiao Xiang Zhu

TL;DR

This work presents adaptively triggered online object-wise label correction (AIO2) to address annotation noise induced by incomplete label sets and evaluates the robustness of AIO2 on two building footprint segmentation datasets with different spatial resolutions.

Abstract

While the volume of remote sensing data is increasing daily, deep learning in Earth Observation faces a lack of accurate annotations for supervised optimization. Crowdsourcing projects such as OpenStreetMap distribute the annotation load to their community. However, such annotation inevitably generates noise due to insufficient control of label quality, a lack of annotators, and frequent changes of the Earth's surface as a result of natural disasters and urban development, among many other factors. We present Adaptively trIggered Online Object-wise correction (AIO2) to address annotation noise induced by incomplete label sets. AIO2 features an Adaptive Correction Trigger (ACT) module that avoids label correction while the model under- or overfits, and an Online Object-wise Correction (O2C) methodology that employs spatial information for automated label modification. AIO2 uses a mean teacher model to make training more robust to noisy labels, which both stabilizes the training accuracy curve fitted in ACT and provides pseudo labels for correction in O2C. Moreover, O2C is implemented online, without the need to store updated labels at every training epoch. We validate our approach on two building footprint segmentation datasets with different spatial resolutions. Experimental results with varying degrees of building label noise demonstrate the robustness of AIO2. Source code will be available at https://github.com/zhu-xlab/AIO2.git.
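
The abstract states that AIO2 relies on a mean teacher model. In the standard mean teacher formulation, the teacher's weights are an exponential moving average (EMA) of the student's weights. The following is a minimal sketch of that update only, not the authors' implementation: the function name `ema_update`, the dictionary-of-arrays parameter layout, and the decay value `alpha` are assumptions made for illustration.

```python
import numpy as np


def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights as an EMA of the student weights.

    teacher, student: dicts mapping a parameter name to a NumPy array
                      (hypothetical layout chosen for this sketch).
    alpha:            EMA decay in [0, 1]; the value 0.99 is an assumption,
                      not a setting taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Each teacher parameter moves a fraction (1 - alpha) toward the student.
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


# Toy usage: the teacher starts at zero and the student sits at one.
teacher = {"w": np.zeros(3)}
student = {"w": np.ones(3)}
for _ in range(2):
    teacher = ema_update(teacher, student, alpha=0.9)
```

Because the teacher averages over many student updates, its predictions tend to fluctuate less from step to step, which is consistent with the abstract's use of the teacher both for a smoother training accuracy curve and for pseudo labels.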

Paper Structure (24 sections, 13 equations, 15 figures, 7 tables, 1 algorithm)

Figures (15)

  • Figure 1: Flowchart of the proposed two-stage AIO2 method for object-level incomplete label sets: Model training is initially conducted using the given noisy labels, where ACT actively monitors the training dynamics to determine when to trigger O2C for label correction.
  • Figure 2: Three-stage training without special considerations for label noise (colors online): training accuracies of teacher models obtained with incomplete noisy labels at a drop rate of 0.5 on the (a) Massachusetts dataset and the (b) Germany dataset. Note: In real-world scenarios, training accuracies (blue) need to be based on noisy labels. Ground-truth label accuracies (red) are presented for reference only. For this figure and all those that follow, statistically fluctuating accuracy curves have been smoothed, with the solid line indicating the mean value and the shaded, semi-transparent region marking the $1\sigma$-area.
  • Figure 3: Numerical exploration of memorization effects on noisy labels for segmentation tasks (colors online): We statistically analyze the training of the (teacher) model on the Massachusetts dataset at a drop rate of 0.5. From the object-wise perspective, we divide all the objects in ground-truth (GT) masks into Marked (solid green rectangle) and Omitted (dashed orange circle), and report their (b) detection rates. An object is counted as detected when it is at least partially predicted. From the pixel-wise perspective, we split all the object pixels into four groups as shown in (a), and calculate their overall accuracies (OAs) with respect to (c) GT labels and (d) noisy labels. Gray background shading highlights the transition phase.
  • Figure 4: An example demonstrating the three-stage training: (b), (c), (d)-(e) show the predictions from the early-learning, transition, and memorization stages, respectively. We list the Intersection-over-Union (IoU) with respect to the (f) noisy labels in the upper left corner.
  • Figure 5: Adaptive Correction Trigger (ACT) module: a three-stage strategy to identify "when" to start label correction (orange $\times$). The non-negative number $w$ in (a) determines the window size of epochs used to numerically estimate $k$, and the blue faded dashed line ($--$) indicates the trend in training accuracy obtained without our ACT module.
  • ...and 10 more figures
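
The caption of Figure 5 mentions numerically estimating a quantity $k$ from the training accuracy curve over a window of epochs. As a rough illustration, the sketch below assumes that $k$ is the slope of the curve and estimates it with a least-squares line fit over a trailing window. The function names, the trailing-window convention, and the argument `window` (which may not map one-to-one onto the paper's $w$) are all assumptions; the actual trigger rule of ACT is not given in this excerpt and is not reproduced here.

```python
import numpy as np


def windowed_slope(acc, window):
    """Least-squares slope of the last `window` accuracy values.

    acc:    per-epoch training accuracies (list or 1-D array).
    window: number of most recent epochs to fit; a hypothetical stand-in
            for the window size controlled by w in Figure 5.
    """
    acc = np.asarray(acc, dtype=float)
    if window < 2:
        raise ValueError("window must cover at least two epochs")
    if acc.size < window:
        raise ValueError("not enough epochs for the requested window")
    y = acc[-window:]
    x = np.arange(window, dtype=float)
    # Degree-1 polyfit returns [slope, intercept].
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)


def slope_curve(acc, window):
    """Slope estimate after each epoch once the window is filled."""
    return [windowed_slope(acc[:t], window)
            for t in range(window, len(acc) + 1)]


# Toy accuracy curve that rises quickly and then flattens (made-up values).
toy_acc = [0.50, 0.60, 0.68, 0.74, 0.78, 0.80, 0.81]
slopes = slope_curve(toy_acc, window=3)
```

On a curve that saturates, the estimated slope shrinks from epoch to epoch; a monitor of this kind could therefore be used to notice when training leaves its fast-improving phase. A wider window gives smoother estimates at the cost of reacting later.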