A Sinkhorn Regularized Adversarial Network for Image Guided DEM Super-resolution using Frequency Selective Hybrid Graph Transformer

Subhajit Paul, Ashutosh Gupta

TL;DR

This work addresses the generation of high-resolution DEMs guided by HR multi-spectral satellite imagery. It introduces a novel hybrid transformer model that combines a Densely connected Multi-Residual Block (DMRB) with multi-headed Frequency Selective Graph Attention (M-FSGA).

Abstract

A Digital Elevation Model (DEM) is an essential resource in the remote sensing (RS) domain for analyzing applications related to surface elevation. Here, we address the generation of high-resolution (HR) DEMs using HR multi-spectral (MX) satellite imagery as a guide by introducing a novel hybrid transformer model consisting of a Densely connected Multi-Residual Block (DMRB) and multi-headed Frequency Selective Graph Attention (M-FSGA). To regulate this process, we use discriminator spatial maps as conditional attention to the MX guide. Further, we present a novel adversarial objective that combines optimization of the Sinkhorn distance with the classical GAN objective. For this objective, we provide both theoretical and empirical evidence of better behavior with respect to vanishing gradients and numerical convergence. Based on experiments on four different DEM datasets, we present qualitative and quantitative comparisons with available baseline methods and show that our proposed model outperforms them, producing sharper details and smaller errors.

Paper Structure (24 sections, 5 theorems, 43 equations, 19 figures, 4 tables)

Key Result

Theorem 1

Let $\mathcal{S}_{C,\varepsilon}(\mu_\theta, \nu)$ be the Sinkhorn loss between measures $\mu_\theta$ and $\nu$ on $\mathcal{X}$ and $\mathcal{Y}$, two bounded subsets of $\mathbb{R}^{d}$, with a $\mathcal{C}^{\infty}$, $L_0$-Lipschitz, and $L_1$-smooth cost function $C$. Then, for $(\theta_1, … where $L$ is the Lipschitz constant in $\theta$, $\kappa = 2(L_0 |\mathcal{X}| + \|C\|_{\infty})$, $B = d.\m…
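
The theorem concerns a Sinkhorn loss, whose exact definition is not reproduced in this excerpt. As a rough illustration only, the sketch below computes a generic entropy-regularized optimal transport cost between two small point clouds using plain Sinkhorn iterations. The function name `sinkhorn_cost`, the squared Euclidean cost, the uniform weights, and the parameters `eps` and `n_iters` are all assumptions made for this example; it is not the authors' implementation and may differ from the loss $\mathcal{S}_{C,\varepsilon}$ used in the paper.

```python
import numpy as np


def sinkhorn_cost(x, y, eps=0.1, n_iters=200):
    """Entropy-regularized OT between two uniformly weighted point clouds.

    Hypothetical helper for illustration; assumes a squared Euclidean cost.
    Returns the transport cost <P, C> and the transport plan P.
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)  # uniform weights on x (assumption)
    b = np.full(m, 1.0 / m)  # uniform weights on y (assumption)

    # Pairwise squared Euclidean cost matrix, shape (n, m).
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)

    # Gibbs kernel; a smaller eps gives a sharper (less blurred) plan.
    K = np.exp(-C / eps)

    # Alternately rescale rows and columns to match the marginals a and b.
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)

    P = u[:, None] * K * v[None, :]  # transport plan
    return float((P * C).sum()), P


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((6, 2))
    near, _ = sinkhorn_cost(pts, pts, eps=1.0)
    far, _ = sinkhorn_cost(pts, pts + 2.0, eps=1.0)
    print(near < far)
```

This direct form can underflow when `eps` is very small relative to the costs; log-domain updates are the usual remedy and are omitted here for brevity.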

Figures (19)

  • Figure 1: Two sample results of DEM SR, each consisting of an HR FCC of NIR(R), R(G), and G(B), the bicubic-interpolated LR DEM, and the generated HR DEM, respectively.
  • Figure 2: Overview of the proposed framework. (a) The generator $G$ has multiple HTBs, each with (c) a DMRB and (d) an FSGT connected in parallel. Given the guide $\mathbf{z}$ and the upsampled LR DEM $\mathbf{\tilde{x}}$ as input to $G$, each HTB extracts global selective frequency information via the FSGT and dense local features via the DMRBs in latent space. (b) The discriminator $D$ consists of DMRBs only. Besides classifying the predicted $\mathbf{\hat{y}}$ and GT $\mathbf{y}$ as real or fake, $D$ also estimates the DSA $\mathbf{D_{SA}}$ from input $\mathbf{\tilde{x}}$. $\mathbf{D_{SA}}$ is passed through a PSA [29] block to estimate $\mathbf{A_s}$, which acts as spatial attention for the HR guide $\mathbf{z}$ when it is passed to $G$ along with $\mathbf{\tilde{x}}$.
  • Figure 3: Workflow of FSGT: (a) graph construction mechanism, (b) FSGA block.
  • Figure 4: Test results (inside India) for DEM super-resolution (better viewed at 200%) and comparisons with other baseline methods.
  • Figure 5: Test results (outside India) for DEM super-resolution (better viewed at 200%) and comparisons with other baseline methods.
  • ...and 14 more figures

Theorems & Definitions (10)

  • Theorem 1: Smoothness of Sinkhorn loss
  • Proof
  • Proposition
  • Proof
  • Proposition
  • Proof
  • Corollary
  • Proof
  • Lemma
  • Proof