Hyperspectral Vision Transformers for Greenhouse Gas Estimations from Space

Ruben Gonzalez Avilés, Linus Scheibenreif, Nassim Ait Ali Braham, Benedikt Blumenstiel, Thomas Brunschwiler, Ranjini Guruprasad, Damian Borth, Conrad Albrecht, Paolo Fraccaro, Devyani Lambhate, Johannes Jakubik

TL;DR

This work tackles the trade-off between spectral resolution and spatial/temporal coverage in satellite-based greenhouse gas monitoring by introducing a spectral transformer masked autoencoder that reconstructs hyperspectral data from multispectral inputs. Pre-trained on band-wise masked hyperspectral data, the model is fine-tuned to map multispectral inputs to synthetic hyperspectral spectra, preserving spatial coherence and enabling more informative spectral signatures for downstream gas predictions. Across reconstruction and GHG-detection tasks for CH$_4$, NO$_2$, and CO$_2$, the approach yields improved spectral fidelity and often closer-to-hyperspectral performance than multispectral baselines, with notable gains in methane detection and mixed results for NO$_2$ depending on temporal alignment. The framework demonstrates the potential to extend hyperspectral-like capabilities to widespread multispectral data, enhancing atmospheric monitoring with self-supervised learning, while highlighting data-size and temporal-misalignment challenges for certain gases.

Abstract

Hyperspectral imaging provides detailed spectral information and holds significant potential for monitoring of greenhouse gases (GHGs). However, its application is constrained by limited spatial coverage and infrequent revisit times. In contrast, multispectral imaging offers broader spatial and temporal coverage but often lacks the spectral detail that can enhance GHG detection. To address these challenges, this study proposes a spectral transformer model that synthesizes hyperspectral data from multispectral inputs. The model is pre-trained via a band-wise masked autoencoder and subsequently fine-tuned on spatio-temporally aligned multispectral-hyperspectral image pairs. The resulting synthetic hyperspectral data retain the spatial and temporal benefits of multispectral imagery and improve GHG prediction accuracy relative to using multispectral data alone. This approach effectively bridges the trade-off between spectral resolution and coverage, highlighting its potential to advance atmospheric monitoring by combining the strengths of hyperspectral and multispectral systems with self-supervised deep learning.
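The band-wise masking used during pre-training can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `bandwise_mask` and `masked_mse`, the 50% mask ratio, the zero-fill of hidden bands, and the choice to score only the masked bands are all assumptions made for illustration, and the toy cube stands in for a real hyperspectral patch.

```python
import numpy as np

def bandwise_mask(cube, mask_ratio=0.5, seed=0):
    """Hide a random subset of spectral bands in a (bands, height, width) cube.

    Returns the masked cube and a boolean vector marking the hidden bands.
    The mask ratio and the zero-fill are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    num_bands = cube.shape[0]
    num_masked = int(round(mask_ratio * num_bands))
    masked_idx = rng.choice(num_bands, size=num_masked, replace=False)

    band_is_masked = np.zeros(num_bands, dtype=bool)
    band_is_masked[masked_idx] = True

    masked_cube = cube.copy()
    masked_cube[band_is_masked] = 0.0  # entire bands are hidden, not pixels
    return masked_cube, band_is_masked

def masked_mse(prediction, target, band_is_masked):
    """Mean squared error over the hidden bands only (assumed objective)."""
    diff = prediction[band_is_masked] - target[band_is_masked]
    return float(np.mean(diff ** 2))

# Toy cube: 8 spectral bands over a 4x4 spatial patch (synthetic values).
cube = np.random.default_rng(1).random((8, 4, 4))
masked_cube, band_is_masked = bandwise_mask(cube, mask_ratio=0.5)
print("hidden bands:", np.flatnonzero(band_is_masked))
```

A reconstruction model would take `masked_cube` as input and be scored with something like `masked_mse` against `cube`. Masking whole bands rather than spatial patches forces the model to infer missing spectral information from the remaining bands, which matches the multispectral-to-hyperspectral setting described above.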

Paper Structure

This paper contains 37 sections, 19 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: 3D visualization of corresponding Sentinel-2 (left) and EnMAP (right) data cubes, each spanning the same spatial dimensions along the horizontal axes and the spectral dimension along the vertical axis.
  • Figure 2: Absorption spectra and satellite band coverage. The upper section illustrates the wavelength coverage of each spectral band, with widths corresponding to the Full-Width Half-Maximum (FWHM) values extracted from the metadata of the corresponding satellite product [sentinel-29324006]. The lower section shows the smoothed absorption spectra for three gases, with absorption data sourced from the High Resolution Transmission (HITRAN) database and smoothed using a Gaussian filter to reduce noise for visualization [gordon-2021].
  • Figure 3: High-level overview of the masked autoencoder approach for hyperspectral image reconstruction. The model applies band-wise masking to input patches, encodes the latent representations using spectral self-attention, and reconstructs the missing spectral information through a decoder.
  • Figure 4: High-level overview of the three-stage experimental setup. The model is first pre-trained on masked hyperspectral data, then fine-tuned on multispectral inputs to reconstruct hyperspectral-like data, and finally evaluated on a downstream GHG estimation task.
  • Figure 5: Original and reconstructed RGB bands (left and middle columns) for selected EnMAP images. The right column displays band-wise normalized reflectance at the location of the red dot in each RGB image, with ground-truth signatures in blue and reconstructed signatures in orange. Masked bands are highlighted in gray.
  • ...and 1 more figure
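The Gaussian smoothing mentioned in the Figure 2 caption can be sketched in a few lines of NumPy. The kernel width (`sigma_samples`), the four-sigma truncation, the edge padding, and the synthetic spectrum below are all assumptions for illustration. The paper's actual filter settings are not given here, and no HITRAN data is used.

```python
import numpy as np

def gaussian_smooth(values, sigma_samples=3.0):
    """Smooth a 1-D spectrum with a normalized Gaussian kernel.

    sigma_samples is the kernel standard deviation in units of samples
    (a hypothetical setting, not taken from the paper).
    """
    radius = int(np.ceil(4 * sigma_samples))  # truncate the kernel at 4 sigma
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_samples) ** 2)
    kernel /= kernel.sum()  # unit sum preserves the overall signal level
    # Edge padding keeps the output the same length as the input.
    padded = np.pad(values, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Synthetic stand-in for a noisy absorption spectrum.
rng = np.random.default_rng(0)
wavelengths = np.linspace(0.0, 1.0, 200)
noisy = np.sin(2 * np.pi * wavelengths) + 0.2 * rng.standard_normal(200)
smoothed = gaussian_smooth(noisy, sigma_samples=3.0)
```

A wider kernel suppresses more noise but also flattens narrow absorption lines, so such smoothing suits visualization rather than quantitative retrieval.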