Global High Categorical Resolution Land Cover Mapping via Weak Supervision

Xin-Yi Tong; Runmin Dong; Xiao Xiang Zhu

TL;DR

The paper tackles the challenge of global fine-grained land cover mapping under limited dense annotations by introducing PRE, a Prototype-based pseudo-label Rectification and Expansion method for weakly supervised domain adaptation (WSDA). PRE uses class prototypes to connect a densely labeled source domain with a sparsely labeled target domain, enabling dynamic pseudo-label expansion and prototype updates during training. The authors present two weakly labeled city datasets (C-megacities and G-cities) and demonstrate cross-sensor, cross-category, and cross-continent WSDA across 10 cities, achieving overall accuracies above 80% and meaningful mIoU gains, especially for G-cities. This approach reduces labeling costs and advances global high categorical resolution land cover mapping, with broad implications for environmental monitoring and sustainable development.

Abstract

Land cover information is indispensable for advancing the United Nations' sustainable development goals, and land cover mapping under a more detailed category system would significantly contribute to economic livelihood tracking and environmental degradation measurement. However, the substantial difficulty of acquiring fine-grained training data makes this task particularly challenging. Here, we propose to combine a fully labeled source domain with a weakly labeled target domain for weakly supervised domain adaptation (WSDA). This is beneficial because sparse and coarse weak labels considerably reduce the labor required for precise and detailed land cover annotation. Specifically, we introduce the Prototype-based pseudo-label Rectification and Expansion (PRE) approach, which leverages prototypes (i.e., the class-wise feature centroids) as a bridge connecting sparse labels and global feature distributions. The confidence of the pseudo-labels predicted in the unlabeled regions of the target domain is assessed according to their feature distances to the prototypes. This confidence is then used to guide the dynamic expansion and rectification of pseudo-labels. Based on PRE, we carry out high categorical resolution land cover mapping for 10 cities in different regions around the world, separately using PlanetScope, Gaofen-1, and Sentinel-2 satellite images. In the study areas, we achieve cross-sensor, cross-category, and cross-continent WSDA, with the overall accuracy exceeding 80%. The promising results indicate that PRE can reduce the dependency of land cover classification on high-quality annotations, thereby improving label efficiency. We expect our work to enable global fine-grained land cover mapping, which in turn will help Earth observation provide more precise and thorough information for environmental monitoring.
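The prototype idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the softmax over negative Euclidean distances, the `temperature` parameter, and the fixed `threshold` are all assumptions made for illustration. Only the use of class-wise feature centroids and distance-based confidence comes from the text.

```python
import numpy as np

IGNORE = -1  # hypothetical marker for unlabeled pixels


def class_prototypes(features, labels, num_classes):
    """Class-wise feature centroids computed from sparsely labeled pixels.

    features: (N, D) per-pixel feature vectors.
    labels:   (N,) integer labels, IGNORE where no annotation exists.
    """
    protos = np.zeros((num_classes, features.shape[1]), dtype=float)
    for c in range(num_classes):
        mask = labels == c
        if not mask.any():
            raise ValueError(f"class {c} has no labeled pixels")
        protos[c] = features[mask].mean(axis=0)
    return protos


def prototype_confidence(features, protos, temperature=1.0):
    """Per-class confidence from distances to prototypes (assumed softmax form)."""
    # Euclidean distance from every pixel to every prototype: shape (N, C).
    dists = np.linalg.norm(features[:, None, :] - protos[None, :, :], axis=2)
    logits = -dists / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def expand_pseudo_labels(labels, predicted, confidence, threshold=0.9):
    """Accept a predicted class on unlabeled pixels only when it is confident."""
    expanded = labels.copy()
    # Confidence assigned to the class the model actually predicted.
    conf_of_pred = confidence[np.arange(len(predicted)), predicted]
    accept = (labels == IGNORE) & (conf_of_pred >= threshold)
    expanded[accept] = predicted[accept]
    return expanded
```

In this sketch, an unlabeled pixel whose feature lies close to the prototype of its predicted class receives that class as a pseudo-label, while a pixel roughly equidistant from several prototypes stays unlabeled until a later iteration.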

Paper Structure (24 sections, 11 equations, 11 figures, 5 tables)

Introduction
Related work
Study area and data
Overview of data
Category system
Data source
Dataset creation
Data for test
Methodology
Overview of PRE
Class prototype learning
Pseudo-label rectification and expansion
Cross-domain overall loss function
Dynamic prototype update
Experiments
...and 9 more sections

Figures (11)

Figure 1: The geographical distribution and category systems of C-megacities, G-cities, and Five-Billion-Pixels. The classes of C-megacities are identical to those of the source domain, while the classes of G-cities have been slightly adjusted. Therefore, between G-cities and the source domain, there exist cross-sensor, cross-category, and cross-continent challenges.
Figure 2: Examples of densely annotated source domain and sparsely annotated target domain. Fine delineation for a single $1000\times 1000$-pixel image with a resolution of 3 m takes approximately 1 hour. In contrast, scribbling an image with the same size and resolution takes only about 1 minute due to the avoidance of outlining boundaries.
Figure 3: Two strategies for quantitative evaluation. Sparse label: coarse blocks are evenly labeled throughout the entire test image. Dense label: sub-regions with sizes of $1000\times 1000$ pixels are labeled on each test image for every city.
Figure 4: Approach overview. We construct a dual-branch semantic segmentation model to separately process $\mathcal{D}_{s}$ and $\mathcal{D}_{t}$, with both branches sharing the exact same architecture and parameters. The backbone of each branch is pre-trained on $\mathcal{D}_{s}$. The feature maps of $\mathcal{D}_{t}$ are used to initialize and update the prototypes, while the distances between prototypes and feature maps are used to refine and filter pseudo-labels. Throughout the iterations, pseudo-labels are continuously expanded, and prototypes are dynamically recomputed. The overall loss function consists of the domain joint segmentation loss, plus the self-training loss coupled with the self-rectification loss in the target branch.
Figure 5: The predicted probabilities are rectified based on the feature distances from the prototypes. Categories whose prototypes are closer to the sample are assigned greater weights, while categories whose prototypes are farther from the sample are assigned smaller weights. The weighted probabilities align better with the global feature distribution.
...and 6 more figures
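
The rectification described in the Figure 5 caption can be sketched as a reweighting of class probabilities. The exact weighting function is not given here, so the softmax over negative distances, the `temperature` parameter, and the function name `rectify_probabilities` are assumptions for illustration only.

```python
import numpy as np


def rectify_probabilities(probs, dists, temperature=1.0):
    """Reweight predicted class probabilities by closeness to class prototypes.

    probs: (N, C) predicted class probabilities, each row summing to 1.
    dists: (N, C) feature distance from each sample to each class prototype.
    """
    # Closer prototypes get larger weights (assumed softmax of negative distance).
    logits = -dists / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Weight the probabilities and renormalize each row to sum to 1.
    weighted = probs * weights
    return weighted / weighted.sum(axis=1, keepdims=True)


# A sample predicted as class 0, but whose feature is much closer to prototype 1.
probs = np.array([[0.6, 0.4]])
dists = np.array([[3.0, 1.0]])
rectified = rectify_probabilities(probs, dists)
```

Under these assumptions, the rectified prediction for the sample above switches to class 1, and a sample that is equally distant from all prototypes keeps its original probabilities.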
