Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels

Zhuohong Li; Wei He; Jiepan Li; Fangxiao Lu; Hongyan Zhang

TL;DR

This work tackles updating large-scale high-resolution (HR) land-cover maps when only low-resolution (LR) historical labels are available. It introduces Paraformer, a weakly supervised framework that pairs a resolution-preserving CNN branch with a Transformer-based global-modeling branch and adds a pseudo-label-assisted training (PLAT) module that refines LR labels into more reliable supervision. Training minimizes L_total = L_ce + L_mce, where L_mce is a mask cross-entropy computed on iteratively refined mask labels, so the model trains end to end without HR labels. Experiments on the Chesapeake Bay and Poland datasets show that Paraformer outperforms state-of-the-art methods in mIoU across diverse LR-label scenarios and a wide range of landforms. The results suggest practical potential for scalable, accurate HR land-cover updating from readily available LR historical data.
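As a rough illustration of the objective L_total = L_ce + L_mce, the sketch below computes both terms on a flattened list of per-pixel class probabilities. It is a minimal sketch under assumptions that this summary does not confirm: pseudo-labels are taken as the model's own argmax predictions, and a pixel enters the mask only when its pseudo-label agrees with the LR label. The function names `cross_entropy` and `total_loss` are hypothetical and are not taken from the paper's code.

```python
import math

def cross_entropy(probs, labels, mask=None):
    """Mean negative log-likelihood; pixels whose mask entry is 0 are skipped."""
    total, count = 0.0, 0
    for i, (p, y) in enumerate(zip(probs, labels)):
        if mask is not None and not mask[i]:
            continue
        total += -math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        count += 1
    return total / count if count else 0.0

def total_loss(probs, lr_labels):
    """Return (L_ce + L_mce, mask) under the assumptions stated in the lead-in."""
    # Assumption: pseudo-labels are the argmax of the predicted probabilities.
    pseudo = [max(range(len(p)), key=p.__getitem__) for p in probs]
    # Assumption: keep a pixel when its pseudo-label agrees with the LR label.
    mask = [int(a == b) for a, b in zip(pseudo, lr_labels)]
    l_ce = cross_entropy(probs, lr_labels)          # plain CE on all pixels
    l_mce = cross_entropy(probs, lr_labels, mask)   # CE on masked pixels only
    return l_ce + l_mce, mask

# Three pixels, two classes; the third prediction disagrees with its LR label.
loss, mask = total_loss([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]], [0, 1, 1])
print(mask)
```

In this toy case the third pixel is dropped from the masked term, so a possibly wrong LR label contributes to L_ce but not to L_mce.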

Abstract

Large-scale high-resolution (HR) land-cover mapping is a vital task to survey the Earth's surface and resolve many challenges facing humanity. However, it is still a non-trivial task hindered by complex ground details, various landforms, and the scarcity of accurate training labels over a wide-span geographic area. In this paper, we propose an efficient, weakly supervised framework (Paraformer) to guide large-scale HR land-cover mapping with easy-access historical land-cover data of low resolution (LR). Specifically, existing land-cover mapping approaches reveal the dominance of CNNs in preserving local ground details but still suffer from insufficient global modeling in various landforms. Therefore, we design a parallel CNN-Transformer feature extractor in Paraformer, consisting of a downsampling-free CNN branch and a Transformer branch, to jointly capture local and global contextual information. Besides, facing the spatial mismatch of training data, a pseudo-label-assisted training (PLAT) module is adopted to reasonably refine LR labels for weakly supervised semantic segmentation of HR images. Experiments on two large-scale datasets demonstrate the superiority of Paraformer over other state-of-the-art methods for automatically updating HR land-cover maps from LR historical labels.
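The comparisons are reported in mIoU (mean intersection over union). For readers unfamiliar with the metric, here is a minimal sketch of one common way to compute it on flattened label lists. The convention of skipping classes that appear in neither the prediction nor the reference is an assumption; the paper's exact evaluation protocol is not described in this summary, and `mean_iou` is a hypothetical name.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # assumption: classes absent from both maps are skipped
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Four pixels, three possible classes; class 2 never occurs and is skipped.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=3)
print(round(score, 4))
```

Class 0 has IoU 1/2 and class 1 has IoU 2/3 here, so the mean is their average.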

Paper Structure (12 sections, 4 equations, 9 figures, 3 tables)

Introduction
Related Work
Methodology
CNN-based resolution-preserving branch
Transformer-based global-modeling branch
Pseudo-Label-Assisted Training module
Experiments
Study areas and using data
Implementation Detail and Metrics
Comparison Results
Ablation experiments
Conclusion

Figures (9)

Figure 1: Illustration of the resolution-mismatch issue in using the HR remote-sensing image (Source) and LR historical labels (Guide) to generate HR land-cover results (Target).
Figure 2: Two modes of large-scale HR land-cover mapping with LR labels. (a) Existing modes either rely on partial HR labels or require non-end-to-end training with human intervention. (b) Paraformer aims to form a mode that is HR-label-free and end-to-end trainable.
Figure 3: Overall workflow of Paraformer. The framework only takes the HR images and LR labels as training input and includes three components: (a) CNN-based resolution-preserving branch, (b) Transformer-based global-modeling branch, and (c) Pseudo-Label-Assisted Training (PLAT) module.
Figure 4: Example of the local mismatch/match in two regions. The edge of water is marked with yellow boundaries. Region 1 shows dispersed lakes around urban areas with unmatched annotation. Region 2 shows a large-scale river with matched annotation.
Figure 5: Demonstration of the training data and visual comparisons of Paraformer and other typical methods on the Chesapeake Bay dataset with 16 classes. (a) HR image. (b) LR label. (c) Land-cover mapping result of Paraformer. (d–h) Land-cover mapping results of five typical methods.
...and 4 more figures
