Transformer for Multitemporal Hyperspectral Image Unmixing

Hang Li, Qiankun Dong, Xueshuo Xie, Xia Xu, Tao Li, Zhenwei Shi

TL;DR

This work tackles multitemporal hyperspectral image unmixing (MTHU) by introducing MUFormer, an end-to-end unsupervised transformer framework. It uses a CNN encoder followed by a Global Awareness Module (GAM) to capture global temporal–spatial–spectral dependencies and a Change Enhancement Module (CEM) to model fine-grained changes between adjacent time phases, with a phase-wise linear decoder to estimate endmembers. The model is trained with a composite loss $L = \beta L_{RE} + \gamma L_{SAD} + \lambda L_E$, balancing reconstruction quality, spectral fidelity, and endmember consistency. Experiments on a real Lake Tahoe sequence and two synthetic datasets show state-of-the-art performance in abundance and endmember estimation, demonstrating strong potential for robust multitemporal unmixing and long-term surface monitoring, while hinting at future directions for denoising and more subtle change handling.
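The composite loss above can be illustrated with a minimal NumPy sketch. The summary gives only the weighted-sum form, so the individual terms here are assumptions: `reconstruction_loss` uses mean squared error, `sad_loss` uses the mean spectral angle, and `endmember_loss` is a hypothetical stand-in that penalizes endmember drift between adjacent phases. The function names and default weights are illustrative and are not taken from the paper.

```python
import numpy as np

def reconstruction_loss(y, y_hat):
    """L_RE, assumed here to be the mean squared reconstruction error."""
    return float(np.mean((y - y_hat) ** 2))

def sad_loss(y, y_hat, eps=1e-8):
    """L_SAD, assumed here to be the mean spectral angle (radians).

    Spectra are compared along the last axis (the band axis).
    """
    dot = np.sum(y * y_hat, axis=-1)
    norms = np.linalg.norm(y, axis=-1) * np.linalg.norm(y_hat, axis=-1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))

def endmember_loss(endmembers):
    """L_E, HYPOTHETICAL form: mean squared drift between adjacent phases.

    endmembers has shape (T, bands, P) for T phases and P endmembers.
    """
    if endmembers.shape[0] < 2:
        return 0.0  # a single phase has no adjacent phase to compare with
    diffs = endmembers[1:] - endmembers[:-1]
    return float(np.mean(diffs ** 2))

def composite_loss(y, y_hat, endmembers, beta=1.0, gamma=1.0, lam=1.0):
    """L = beta * L_RE + gamma * L_SAD + lam * L_E (weights are placeholders)."""
    return (beta * reconstruction_loss(y, y_hat)
            + gamma * sad_loss(y, y_hat)
            + lam * endmember_loss(endmembers))

# Toy check: two orthogonal spectra and endmembers that do not change over time.
y = np.array([[1.0, 0.0]])
y_hat = np.array([[0.0, 1.0]])
static_endmembers = np.ones((3, 2, 2))
total = composite_loss(y, y_hat, static_endmembers)
```

In this toy case the squared error is 1, the spectral angle is a right angle, and the endmember term vanishes because the endmembers are identical across phases.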

Abstract

Multitemporal hyperspectral image unmixing (MTHU) is important for monitoring and analyzing dynamic changes of the Earth's surface. However, compared to single-temporal unmixing, the multitemporal approach must jointly consider information across different phases, which makes it more challenging. To address this challenge, we propose the Multitemporal Hyperspectral Image Unmixing Transformer (MUFormer), an end-to-end unsupervised deep learning model. To effectively perform multitemporal hyperspectral image unmixing, we introduce two key modules: the Global Awareness Module (GAM) and the Change Enhancement Module (CEM). The Global Awareness Module computes self-attention across all phases, facilitating global weight allocation. The Change Enhancement Module, in turn, dynamically learns local temporal changes by comparing endmember changes between adjacent phases. The synergy between these modules captures semantic information about endmember and abundance changes, thereby improving multitemporal hyperspectral image unmixing. We conducted experiments on one real dataset and two synthetic datasets, demonstrating that our model significantly improves multitemporal hyperspectral image unmixing performance.
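The two ideas in the abstract, attention across all phases and comparison of adjacent phases, can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the MUFormer implementation: it uses single-head scaled dot-product self-attention over one token per phase, random projection matrices, and a plain first-order difference as a stand-in for change modeling. The names `temporal_self_attention` and `adjacent_phase_change` and the feature width `d` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def temporal_self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over phases.

    tokens: (T, d), one feature vector per phase for a given spatial location.
    Returns the attended features (T, d_v) and the attention weights (T, T).
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each phase attends to every phase
    return weights @ v, weights

def adjacent_phase_change(features):
    """ASSUMED stand-in for change modeling: phase t+1 minus phase t."""
    return features[1:] - features[:-1]

rng = np.random.default_rng(0)
T, d = 6, 8  # six phases, as in the Lake Tahoe sequence; d is arbitrary
tokens = rng.normal(size=(T, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

attended, weights = temporal_self_attention(tokens, w_q, w_k, w_v)
changes = adjacent_phase_change(attended)
```

Each row of `weights` is a distribution over all phases, which is one way to read "global weight allocation", while `changes` holds one difference vector per pair of adjacent phases.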

Paper Structure (13 sections, 17 equations, 9 figures, 4 tables)

This paper contains 13 sections, 17 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Structure of the MUFormer model. GAM stands for Global Awareness Module and CEM stands for Change Enhancement Module, as shown in the lower left corner of the illustration.
  • Figure 2: Global Awareness Module. Step 1 is the temporal self-attention computation, Step 2 is the spatial self-attention computation, Step 3 is the shared MLP module, and Step 4 is the spectral self-attention computation. In Steps 1 and 2, red marks the current query patch, blue marks the patches that participate in the attention computation, and yellow marks the patches that do not.
  • Figure 3: Lake Tahoe hyperspectral image sequence. Acquisition dates, from left to right: 04/10/2014, 06/02/2014, 09/19/2014, 11/17/2014, 04/29/2015, and 10/13/2015.
  • Figure 4: Multitemporal abundance maps of the Lake Tahoe hyperspectral images, with water, soil, and vegetation endmembers from left to right.
  • Figure 5: Endmember estimation results for synthetic data 1. The first row shows the true endmembers and the second row shows the endmembers estimated by our model.
  • ...and 4 more figures