GEWDiff: Geometric Enhanced Wavelet-based Diffusion Model for Hyperspectral Image Super-resolution

Sirui Wang, Jiang He, Natàlia Blasco Andreo, Xiao Xiang Zhu

TL;DR

GEWDiff is a 4× hyperspectral image super-resolution framework that integrates a wavelet-based encoder–decoder with a geometry-enhanced diffusion process. An RWA+PCA-based encoder compresses HSIs into a compact latent representation, while a geometry-aware diffusion module, edge-aware noise scheduling, and mask conditioning preserve geometric structures and spectral fidelity. A multi-level loss balances pixel-level accuracy, perceptual similarity, and gradient consistency to promote fast, stable convergence. Experiments on EnMAP Campaign and MDAS demonstrate state-of-the-art fidelity, spectral accuracy, and cross-dataset robustness, with practical benefits for downstream land-cover tasks and real-world satellite data fusion. The work highlights the benefit of coupling spectral-domain compression with geometric priors to enable fast, high-quality hyperspectral SR at scale.
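To make the spectral-compression idea concrete, the following is a minimal sketch of the PCA half of such an encoder, written with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the RWA stage is not reproduced, the function names `pca_compress` and `pca_decompress` are hypothetical, and the synthetic cube stands in for real HSI data.

```python
import numpy as np

def pca_compress(cube, n_components):
    """Project an (H, W, B) cube onto its top spectral principal components."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    # Rows of vt are the principal spectral directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]                # (n_components, B)
    latent = centered @ basis.T              # (H*W, n_components)
    return latent.reshape(h, w, n_components), basis, mean

def pca_decompress(latent, basis, mean):
    """Map an (H, W, K) latent back to the original (H, W, B) band space."""
    h, w, k = latent.shape
    pixels = latent.reshape(-1, k) @ basis + mean
    return pixels.reshape(h, w, basis.shape[1])

# Hypothetical test cube: every pixel is a mixture of 3 spectral signatures,
# so the 64-band cube has only 3 degrees of spectral freedom.
rng = np.random.default_rng(0)
endmembers = rng.random((3, 64))
abundances = rng.random((16, 16, 3))
cube = abundances @ endmembers               # (16, 16, 64)

latent, basis, mean = pca_compress(cube, n_components=3)
recon = pca_decompress(latent, basis, mean)
```

Because the synthetic cube is a mixture of three signatures, three components are enough to reconstruct it up to floating-point error. Real HSIs are not exactly low-rank, so the number of retained components trades memory against spectral fidelity.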

Abstract

Improving the quality of hyperspectral images (HSIs), for example through super-resolution, is a crucial research area. However, generative modeling for HSIs presents several challenges. Because of their high spectral dimensionality, HSIs are too memory-intensive to feed directly into conventional diffusion models. Furthermore, general-purpose generative models lack an understanding of the topological and geometric structures of ground objects in remote sensing imagery. In addition, most diffusion models optimize loss functions at the noise level, which leads to non-intuitive convergence behavior and suboptimal generation quality for complex data. To address these challenges, we propose the Geometric Enhanced Wavelet-based Diffusion Model (GEWDiff), a novel framework for reconstructing hyperspectral images at 4× super-resolution. A wavelet-based encoder-decoder efficiently compresses HSIs into a latent space while preserving spectral-spatial information. To avoid distortion during generation, we incorporate a geometry-enhanced diffusion process that preserves geometric features. Furthermore, a multi-level loss function guides the diffusion process, promoting stable convergence and improved reconstruction fidelity. Our model achieves state-of-the-art results across multiple dimensions, including fidelity, spectral accuracy, visual realism, and clarity.
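As a rough illustration of a loss evaluated on the reconstructed image rather than on the noise, the sketch below combines a pixel-level L1 term with a gradient-consistency term. It is an assumption-laden example, not the paper's formulation: the function names, the forward-difference gradient, and the weights `w_pixel` and `w_grad` are hypothetical, and the perceptual term is omitted because it would require a pretrained feature network.

```python
import numpy as np

def spatial_gradients(img):
    """Forward differences along height and width of an (H, W, C) array."""
    dy = img[1:, :, :] - img[:-1, :, :]
    dx = img[:, 1:, :] - img[:, :-1, :]
    return dy, dx

def pixel_gradient_loss(pred, target, w_pixel=1.0, w_grad=0.5):
    """Weighted sum of an L1 pixel term and an L1 gradient-consistency term."""
    pixel = np.abs(pred - target).mean()
    pdy, pdx = spatial_gradients(pred)
    tdy, tdx = spatial_gradients(target)
    grad = np.abs(pdy - tdy).mean() + np.abs(pdx - tdx).mean()
    return w_pixel * pixel + w_grad * grad

# Hypothetical data: a random "ground truth" patch and a copy with a
# constant brightness offset.
rng = np.random.default_rng(0)
target = rng.random((8, 8, 4))
shifted = target + 0.1

loss_identical = pixel_gradient_loss(target, target)
loss_shifted = pixel_gradient_loss(shifted, target)
```

A constant offset leaves the spatial gradients unchanged, so it is penalized only by the pixel term, whereas blurred or displaced edges would raise the gradient term as well. This is one way such a loss can emphasize geometric structure.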

Paper Structure

This paper contains 35 sections, 22 equations, 15 figures, and 6 tables.

Figures (15)

  • Figure 1: Illustration of the Geometric Enhanced Wavelet-based Diffusion Model pipeline.
  • Figure 2: Illustration of the wavelet-based encoder-decoder.
  • Figure 3: Edge perturbed noisy image over time.
  • Figure 4: 4-times visual comparisons with SOTA SR models on (a) MDAS sample 1, (b) MDAS sample 2, and (c) WDC dataset.
  • Figure 5: Spectral profile of a random pixel and mean difference value in each band of (a–b) MDAS sample 1, (c–d) MDAS sample 2, and (e–f) WDC dataset.
  • ...and 10 more figures