Multi-task Learning for Human Settlement Extent Regression and Local Climate Zone Classification

Chunping Qiu, Lukas Liebel, Lloyd H. Hughes, Michael Schmitt, Marco Körner, Xiao Xiang Zhu

TL;DR

Multi-task learning (MTL) is introduced to HSE regression and LCZ classification for the first time. An MTL framework and an end-to-end convolutional neural network (CNN) are proposed; the network consists of a backbone for shared feature learning, attention modules for task-specific feature learning, and a weighting strategy for balancing the two tasks.

Abstract

Human Settlement Extent (HSE) and Local Climate Zone (LCZ) maps are both essential data sources, e.g., for sustainable urban development and Urban Heat Island (UHI) studies. Remote sensing (RS)- and deep learning (DL)-based classification approaches play a significant role by providing the potential for global mapping. However, most efforts focus on only one of the two schemes, usually at a specific scale. This leads to unnecessary redundancy, since the learned features could be leveraged for both of these related tasks. In this letter, the concept of multi-task learning (MTL) is introduced to HSE regression and LCZ classification for the first time. We propose an MTL framework and develop an end-to-end Convolutional Neural Network (CNN), which consists of a backbone network for shared feature learning, attention modules for task-specific feature learning, and a weighting strategy for balancing the two tasks. We additionally propose to exploit HSE predictions as a prior for LCZ classification to enhance its accuracy. The MTL approach was extensively tested with Sentinel-2 data of 13 cities across the world. The results demonstrate that the framework is able to provide a competitive solution for both tasks.
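
The idea of balancing a regression task and a classification task with a weighting strategy can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so the fixed scalar weights `w_hse` and `w_lcz` below are an assumption made purely for illustration, as are all function names; the paper's actual strategy may differ (for example, it may learn the weights during training).

```python
import math


def mse_loss(pred, target):
    """Mean squared error, used here for HSE density regression."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy_loss(logits, label):
    """Softmax cross-entropy for one LCZ sample; `label` is a class index."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def multitask_loss(hse_pred, hse_ref, lcz_logits, lcz_label,
                   w_hse=0.5, w_lcz=0.5):
    """Weighted sum of the two task losses.

    Assumption: fixed, hand-chosen weights (hypothetical); this is not
    necessarily the weighting strategy proposed in the paper.
    """
    return (w_hse * mse_loss(hse_pred, hse_ref)
            + w_lcz * cross_entropy_loss(lcz_logits, lcz_label))


if __name__ == "__main__":
    # Toy values: two HSE density pixels and three hypothetical LCZ classes.
    loss = multitask_loss([0.1, 0.9], [0.0, 1.0], [2.0, 0.5, -1.0], 0)
    print(f"combined loss: {loss:.4f}")
```

With fixed weights, the task whose loss has the larger magnitude tends to dominate the shared backbone's gradients, which is the kind of imbalance a dedicated weighting strategy is meant to address.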

Paper Structure

This paper contains 15 sections, 1 equation, 4 figures, 1 table.

Figures (4)

  • Figure 1: A general MTL framework for HSE density regression and LCZ classification, consisting of a backbone network, task-specific network branches, and decoder modules. The inputs for network training are images and corresponding reference labels for each task.
  • Figure 2: Illustration of the implemented MTL CNN architecture for HSE regression and LCZ classification. The backbone network consists of two convolutional blocks, one pooling block, and two more convolutional blocks. The two task-specific network branches are indicated by two different colors. The description of each layer and the size of inputs, feature maps, and outputs are listed along with the operations. $h$ and $w$ are height and width of the input patch, and $f$ is the number of feature maps from the first layer.
  • Figure 3: Sample number of each test scene for LCZ classification assessment.
  • Figure 4: An illustration of joint prediction results in New York City (NYC), USA. From left to right: the HSE regression result, the LCZ classification result (with the same legend as Figure 3), and the HR image overlaid with reference points (red for HSE, black for non-HSE).