Enhancing Hyperspectral Images via Diffusion Model and Group-Autoencoder Super-resolution Network

Zhaoyang Wang, Dongyang Li, Mingyang Zhang, Hao Luo, Maoguo Gong

TL;DR

This work tackles hyperspectral image super-resolution (HSI SR) where diffusion models struggle with high spectral dimensionality. It introduces DMGASR, a two-stage framework that first trains a Group Autoencoder to map an HSI into a compact latent list $[Z^1_{HR}, Z^2_{HR}, \

Abstract

Existing hyperspectral image (HSI) super-resolution (SR) methods struggle to capture complex spectral-spatial relationships and low-level details, whereas diffusion models are a promising class of generative models known for modeling complex relations and learning both high- and low-level visual features. However, applying diffusion models directly to HSI SR is hampered by difficult model convergence and protracted inference time. In this work, we introduce a novel Group-Autoencoder (GAE) framework that is combined with a diffusion model to construct an effective HSI SR model (DMGASR). The proposed GAE framework encodes high-dimensional HSI data into a low-dimensional latent space in which the diffusion model operates, thereby easing the training of the diffusion model while maintaining band correlation and considerably reducing inference time. Experimental results on both natural and remote sensing hyperspectral datasets demonstrate that the proposed method is superior to other state-of-the-art methods both visually and quantitatively.

Paper Structure

This paper contains 18 sections, 5 equations, 9 figures, 5 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Our proposed framework combines three key techniques: spectral grouping and fusing, an autoencoder, and a diffusion-based SR network.
  • Figure 2: Overview of the proposed model. In Stage 1, the autoencoder is trained to encode the input data into a series of hidden variables ($[Z_{HR}^1, Z_{HR}^2, \cdots, Z_{HR}^n]$). In Stage 2, the diffusion model is trained. The grouped data ($G_{HR}^i$ and $G_{LR}^i$) are first encoded into hidden variables ($Z_{HR}^i$ and $Z_{LR}^i$); $Z_{LR}^i$ then serves as conditional information and is concatenated directly with the hidden variable $Z_{SR,t}^i$ at each step of the denoising process.
  • Figure 3: Qualitative results and corresponding error maps of different models at scale 4 on the PaviaC dataset. A false-color image is used for clear visualization (red: 100, green: 30, and blue: 10).
  • Figure 4: Qualitative results and corresponding error maps of different models at scale 4 on the Chikusei dataset. A false-color image is used for clear visualization (red: 70, green: 100, and blue: 36).
  • Figure 5: Qualitative results and corresponding error maps of different models at scale 4 on the Harvard dataset. A false-color image is used for clear visualization (red: 25, green: 15, and blue: 2).
  • ...and 4 more figures
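
The data flow described in the Figure 2 caption (group the bands, encode each group, then concatenate the LR latent with the noisy latent) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the function names, the `group_size` and `overlap` parameters, the overlapping grouping scheme, and the linear `toy_encode` stand-in, which is not the paper's Group Autoencoder. The sketch also assumes the LR and HR latents already share the same spatial size.

```python
import numpy as np


def split_into_groups(hsi, group_size, overlap):
    """Split an HSI cube of shape (bands, H, W) into overlapping band groups.

    Hypothetical grouping scheme: a sliding window over the band axis, with
    the last window anchored at the final band so that every band is covered.
    """
    bands = hsi.shape[0]
    assert 0 <= overlap < group_size <= bands
    step = group_size - overlap
    starts = list(range(0, bands - group_size + 1, step))
    if starts[-1] + group_size < bands:
        starts.append(bands - group_size)
    return [hsi[s:s + group_size] for s in starts]


def toy_encode(group, weights):
    """Stand-in encoder: a linear mix of bands (NOT the paper's GAE).

    weights has shape (latent_channels, group_size); the result has shape
    (latent_channels, H, W).
    """
    return np.tensordot(weights, group, axes=([1], [0]))


def condition_latent(z_sr_t, z_lr):
    """Concatenate the noisy latent Z_SR,t with the LR latent Z_LR.

    Concatenation is along the channel axis, so the denoiser input has
    twice as many channels as a single latent.
    """
    return np.concatenate([z_sr_t, z_lr], axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cube = rng.random((31, 16, 16))           # toy cube: 31 bands, 16x16 pixels
    weights = rng.standard_normal((3, 8))     # hypothetical 8-band -> 3-channel map
    groups = split_into_groups(cube, group_size=8, overlap=2)
    latents = [toy_encode(g, weights) for g in groups]
    noisy = rng.standard_normal(latents[0].shape)  # placeholder for Z_SR,t
    denoiser_input = condition_latent(noisy, latents[0])
    print(len(groups), latents[0].shape, denoiser_input.shape)
```

In this sketch each group is encoded independently with shared weights, which keeps the latent channel count fixed regardless of how many bands the sensor provides; whether the paper shares encoder weights across groups is not stated in the text above.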