Multispectral to Hyperspectral using Pretrained Foundational model

Ruben Gonzalez, Conrad M Albrecht, Nassim Ait Ali Braham, Devyani Lambhate, Joao Lucas de Sousa Almeida, Paolo Fraccaro, Benedikt Blumenstiel, Thomas Brunschwiler, Ranjini Bangalore

TL;DR

This work addresses the infrequent revisit times of hyperspectral sensors by reconstructing hyperspectral data from multispectral inputs with two pretrained transformer models trained by self-supervised masked reconstruction. The Spectral and Spatial-Spectral models are pretrained on EMIT and EnMAP data and then fine-tuned on spatio-temporally aligned pairs (Sentinel-2/EnMAP and HLS-S30/EMIT) to learn spectral and spatial-spectral structure. Key contributions are two masking strategies (spectral and spatial-spectral) within an MAE-like ViT framework, a description of the pretraining and fine-tuning datasets, and empirical evidence that spatial-spectral masking improves reconstruction quality, as measured by MSE and SSIM, and enables multispectral-to-hyperspectral translation. The findings have practical implications for atmospheric monitoring and GHG detection: they could support CH4/NO2 mapping and other downstream tasks when hyperspectral data are unavailable. Ongoing work aims to extend the masking to the temporal dimension and to multimodal datasets.
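To make the distinction between the two masking strategies concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that spectral masking hides whole bands, that spatial-spectral masking hides individual (band, patch) tokens, and that the loss is evaluated on masked entries only, as is common in MAE-style training. The function names, the band count of 200, the 4x4 patch grid, and the 75% mask ratio are all hypothetical values chosen for illustration.

```python
import numpy as np


def spectral_mask(num_bands, mask_ratio, rng):
    """Boolean mask over bands; True marks a hidden band (assumed scheme)."""
    num_masked = int(round(mask_ratio * num_bands))
    mask = np.zeros(num_bands, dtype=bool)
    mask[rng.choice(num_bands, size=num_masked, replace=False)] = True
    return mask


def spatial_spectral_mask(num_bands, grid_h, grid_w, mask_ratio, rng):
    """Boolean mask over (band, patch_row, patch_col) tokens (assumed scheme)."""
    total = num_bands * grid_h * grid_w
    num_masked = int(round(mask_ratio * total))
    mask = np.zeros(total, dtype=bool)
    mask[rng.choice(total, size=num_masked, replace=False)] = True
    return mask.reshape(num_bands, grid_h, grid_w)


def masked_mse(target, prediction, mask):
    """Mean squared error over the masked entries only."""
    diff = (target - prediction)[mask]
    return float(np.mean(diff ** 2))


# Illustrative values only: 200 bands, a 4x4 patch grid, 75% of tokens hidden.
rng = np.random.default_rng(0)
band_mask = spectral_mask(num_bands=200, mask_ratio=0.75, rng=rng)
token_mask = spatial_spectral_mask(
    num_bands=200, grid_h=4, grid_w=4, mask_ratio=0.75, rng=rng
)
```

Under these assumptions, a band mask removes the same bands at every spatial location, whereas a token mask can hide a band in one patch and leave it visible in another, so the model must use both neighboring bands and neighboring patches to reconstruct the hidden values.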

Abstract

Hyperspectral imaging provides detailed spectral information, offering significant potential for monitoring greenhouse gases like CH4 and NO2. However, its application is constrained by limited spatial coverage and infrequent revisit times. In contrast, multispectral imaging delivers broader spatial and temporal coverage but lacks the spectral granularity required for precise GHG detection. To address these challenges, this study proposes Spectral and Spatial-Spectral transformer models that reconstruct hyperspectral data from multispectral inputs. The models are pretrained on the EnMAP and EMIT datasets and fine-tuned on spatio-temporally aligned (Sentinel-2, EnMAP) and (HLS-S30, EMIT) image pairs, respectively. Our models have the potential to enhance atmospheric monitoring by combining the strengths of hyperspectral and multispectral imaging systems.

Paper Structure

This paper contains 15 sections, 1 equation, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Masking strategies
  • Figure 2: Spectral reconstruction for selected pixels for the two datasets. Masked bands are shown as gray regions and unmasked bands as white regions.
  • Figure 3: Original and reconstructed masked bands for the two datasets