ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing

Zhihui Zhang, Jinhui Pang, Jianan Li, Xiaoshuai Hao

TL;DR

ESC-MISR (Enhancing Spatial Correlations in MISR) is an end-to-end framework that exploits the spatial-temporal relations of multiple images for HR image reconstruction. Its core is a new fusion module, the Multi-Image Spatial Transformer (MIST).

Abstract

Multi-Image Super-Resolution (MISR) is a crucial yet challenging research task in the remote sensing community. In this paper, we address Multi-Image Super-Resolution in Remote Sensing (MISR-RS), which aims to generate a High-Resolution (HR) image from multiple Low-Resolution (LR) images obtained by satellites. Recently, the weak temporal correlations among LR images have attracted increasing attention in the MISR-RS task. However, existing MISR methods treat the LR images as sequences with strong temporal correlations, overlooking spatial correlations and imposing temporal dependencies. To address this problem, we propose a novel end-to-end framework named Enhancing Spatial Correlations in MISR (ESC-MISR), which fully exploits the spatial-temporal relations of multiple images for HR image reconstruction. Specifically, we first introduce a novel fusion module named Multi-Image Spatial Transformer (MIST), which emphasizes parts with clearer global spatial features and enhances the spatial correlations between LR images. In addition, we apply a random shuffle strategy to the sequential inputs of LR images in the training stage to attenuate temporal dependencies and capture weak temporal correlations. Compared with state-of-the-art methods, our ESC-MISR achieves cPSNR improvements of 0.70 dB and 0.76 dB on the two bands of the PROBA-V dataset, respectively, demonstrating the superiority of our method.
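
The random shuffle strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes the LR inputs of one scene are stacked as a NumPy array of shape (K, H, W); the function name `shuffle_lr_stack` and this array layout are hypothetical choices for illustration and are not taken from the paper's code.

```python
import numpy as np


def shuffle_lr_stack(lr_stack, rng):
    """Return the K frames of lr_stack in a random order, plus the order used.

    lr_stack: array of shape (K, H, W), one LR frame per leading index
              (layout assumed for this sketch).
    rng:      a numpy.random.Generator, so the shuffle is reproducible.
    """
    # Draw a random permutation of the frame indices 0..K-1.
    order = rng.permutation(lr_stack.shape[0])
    # Fancy indexing returns a reordered copy; the input is left untouched.
    return lr_stack[order], order


# Toy scene: K = 4 frames of size 2 x 2.
rng = np.random.default_rng(0)
lr_stack = np.arange(4 * 2 * 2, dtype=np.float32).reshape(4, 2, 2)
shuffled, order = shuffle_lr_stack(lr_stack, rng)
```

Applying such a shuffle only during training means the network never sees a fixed frame order, which is one plausible way to realize the stated goal of attenuating temporal dependencies.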

Paper Structure

This paper contains 16 sections, 13 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: The overall framework of our ESC-MISR. It consists of three parts: encoder CNNs-Meet-Transformers (CMT), fusion module Multi-Image Spatial Transformer (MIST), and decoder Fast Fourier Convolution (FFC). Additionally, we employ the shuffle strategy to attenuate temporal dependencies in the training stage.
  • Figure 2: Qualitative evaluations. Visual comparison between different MISR-RS methods on the imgset0614 scene of the NIR band. The comparative parts are highlighted with red boxes.
  • Figure 3: Qualitative evaluations. Visual comparison between different MISR-RS methods on the imgset0024 scene of the RED band. The comparative parts are highlighted with red boxes.
  • Figure 4: Impact of the number of random shuffles on cPSNR. As the number of random shuffles increases, the performance of ESC-MISR tends to stabilize.
  • Figure 5: Impact of the number of input images K. cPSNR of the SR output for different values of K on the NIR and RED bands.
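
The results above are reported in cPSNR, which this page does not define. The sketch below assumes the convention commonly used with the PROBA-V benchmark: images scaled to [0, 1], a 3-pixel border cropped from the SR image, a search over all shifts of the HR image, a per-shift brightness bias correction, and a mask of clear (cloud-free) HR pixels. The function name `cpsnr`, its arguments, and the small `eps` clamp are hypothetical; the paper's own evaluation code may differ.

```python
import numpy as np


def cpsnr(sr, hr, clear_mask, max_shift=3, eps=1e-12):
    """Sketch of a corrected PSNR (in dB) for 2-D images scaled to [0, 1].

    sr, hr:     arrays of identical shape (H, W).
    clear_mask: array of shape (H, W), 1 where the HR pixel is usable.
    """
    h, w = sr.shape
    # Crop a border from SR so the HR patch can slide over every shift.
    sr_crop = sr[max_shift:h - max_shift, max_shift:w - max_shift]
    ch, cw = sr_crop.shape
    best = -np.inf
    for u in range(2 * max_shift + 1):
        for v in range(2 * max_shift + 1):
            hr_patch = hr[u:u + ch, v:v + cw]
            mask = clear_mask[u:u + ch, v:v + cw]
            n_clear = mask.sum()
            if n_clear == 0:
                continue  # no usable pixels at this shift
            # Brightness bias: mean difference over the clear pixels.
            bias = ((hr_patch - sr_crop) * mask).sum() / n_clear
            # Mean squared error after bias correction, clear pixels only.
            cmse = (((hr_patch - (sr_crop + bias)) ** 2) * mask).sum() / n_clear
            # Clamp to eps so a perfect match gives a finite score.
            best = max(best, -10.0 * np.log10(max(cmse, eps)))
    return best


rng = np.random.default_rng(0)
hr = rng.random((32, 32))
mask = np.ones_like(hr)
score_offset = cpsnr(hr + 0.1, hr, mask)                  # constant brightness offset
score_shift = cpsnr(np.roll(hr, (-1, 1), (0, 1)), hr, mask)  # small translation
score_noisy = cpsnr(hr + 0.05 * rng.standard_normal(hr.shape), hr, mask)
```

Under these assumptions, a constant brightness offset or a translation of up to three pixels does not lower the score, while additive noise does. That tolerance is the reason a corrected metric is preferred over plain PSNR for imperfectly registered satellite acquisitions.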