
Spectral Image Data Fusion for Multisource Data Augmentation

Roberta Iuliana Luca, Alexandra Baicoianu, Ioana Cristina Plajer

TL;DR

The paper addresses the difficulty of training on spectral datasets whose band counts and spectral resolutions differ by proposing a fusion pipeline that interpolates multisource spectra onto a common reference grid. It uses four interpolation methods (linear, quadratic, cubic, and PCHIP) with Pavia University as the wavelength reference, and evaluates fidelity with CMSE and NDVI as well as through downstream semantic segmentation with FCNN and UNet. The study reports that direct spectral alignment is feasible across six datasets and that fused data can support robust segmentation, with the four interpolation methods offering dataset-dependent trade-offs. Overall, the approach provides a practical preprocessing step for spectral data augmentation, enabling broader cross-source generalization and potential improvements in real-world remote sensing and hyperspectral analysis.
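The resampling step summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `resample_spectra`, the synthetic cube, and both wavelength grids (a 50-band source over 400 to 900 nm and a 103-band reference over 430 to 860 nm) are assumptions chosen only for illustration. It relies on SciPy's `interp1d` and `PchipInterpolator` to cover the four methods named in the summary.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator, interp1d


def resample_spectra(cube, src_wl, ref_wl, method="linear"):
    """Resample the last (spectral) axis of `cube` from src_wl onto ref_wl.

    `method` is one of "linear", "quadratic", "cubic", or "pchip".
    """
    # Refuse to extrapolate: the reference grid must lie inside the source range.
    if ref_wl.min() < src_wl.min() or ref_wl.max() > src_wl.max():
        raise ValueError("reference grid must lie inside the source range")
    if method == "pchip":
        f = PchipInterpolator(src_wl, cube, axis=-1)
    else:
        f = interp1d(src_wl, cube, kind=method, axis=-1)
    return f(ref_wl)


# Hypothetical data: a 4 x 5 pixel image with 50 source bands.
rng = np.random.default_rng(0)
src_wl = np.linspace(400.0, 900.0, 50)   # assumed source band centers (nm)
ref_wl = np.linspace(430.0, 860.0, 103)  # assumed reference band centers (nm)
cube = rng.random((4, 5, src_wl.size))

methods = ("linear", "quadratic", "cubic", "pchip")
resampled = {m: resample_spectra(cube, src_wl, ref_wl, m) for m in methods}
```

After this step every source image shares the spectral dimension of the reference, so images from different sensors can be stacked into one training set. The sketch rejects reference grids that extend beyond the source range rather than extrapolating; how the paper handles non-overlapping ranges is not stated in this summary.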

Abstract

Multispectral and hyperspectral images are increasingly popular in different research fields, such as remote sensing, astronomical imaging, or precision agriculture. However, the amount of free data available for machine learning tasks is relatively small. Moreover, artificial intelligence models developed in the area of spectral imaging require input images with a fixed spectral signature, expecting the data to have the same number of spectral bands or the same spectral resolution. This requirement significantly reduces the number of sources usable for a given model. The aim of this study is to introduce a methodology for spectral image data fusion that allows machine learning models to be trained and/or used on data from a larger number of sources, thus providing better generalization. For this purpose, we propose different interpolation techniques to make multisource spectral data compatible with each other. The interpolation outcomes are evaluated through various approaches. These include direct assessments using surface plots and metrics such as a Custom Mean Squared Error (CMSE) and the Normalized Difference Vegetation Index (NDVI). Additionally, indirect evaluation is done by estimating the impact of the interpolated data on machine learning model training, particularly for semantic segmentation.
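As an illustration of the NDVI metric mentioned in the abstract, the sketch below computes the standard index, (NIR - Red) / (NIR + Red), from a spectral cube by picking the bands nearest to a red and a near-infrared wavelength. The band centers (670 nm and 800 nm), the helper names, and the two-pixel example are assumptions for illustration; the abstract does not say which bands the authors use or exactly how NDVI enters their evaluation.

```python
import numpy as np


def nearest_band(wavelengths, target_nm):
    """Index of the band whose center is closest to target_nm."""
    return int(np.argmin(np.abs(wavelengths - target_nm)))


def ndvi(cube, wavelengths, red_nm=670.0, nir_nm=800.0, eps=1e-12):
    """Per-pixel NDVI for a cube whose last axis is spectral.

    red_nm and nir_nm are assumed band centers, not values from the paper.
    eps guards against division by zero on dark pixels.
    """
    red = cube[..., nearest_band(wavelengths, red_nm)].astype(float)
    nir = cube[..., nearest_band(wavelengths, nir_nm)].astype(float)
    return (nir - red) / (nir + red + eps)


# Hypothetical 1 x 2 image on an assumed 103-band grid (430 to 860 nm).
wl = np.linspace(430.0, 860.0, 103)
pixels = np.zeros((1, 2, wl.size))
red_idx, nir_idx = nearest_band(wl, 670.0), nearest_band(wl, 800.0)
pixels[0, 0, red_idx], pixels[0, 0, nir_idx] = 0.05, 0.50  # vegetation-like
pixels[0, 1, red_idx], pixels[0, 1, nir_idx] = 0.30, 0.30  # flat spectrum

index_map = ndvi(pixels, wl)
```

One plausible use, under these assumptions, is to compute the index on both the original and the interpolated cube and compare the two maps: an interpolation that preserves the red and near-infrared response should leave the index nearly unchanged.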

Paper Structure (25 sections, 6 equations, 16 figures, 6 tables)

This paper contains 25 sections, 6 equations, 16 figures, 6 tables.

Figures (16)

  • Figure 1: Pavia University: (a) Visualization using 3 bands; (b) Original Ground Truth; (c) Processed Ground Truth (black for unknown; green for vegetation; red for non-vegetation)
  • Figure 2: KSC image: (a) Visualization using 3 bands; (b) Original Ground Truth; (c) Processed Ground Truth (black for unknown; green for vegetation; red for non-vegetation)
  • Figure 3: Botswana image: (a) Visualization using 3 bands; (b) Original Ground Truth; (c) Processed Ground Truth (black for unknown; green for vegetation; red for non-vegetation)
  • Figure 4: Indian Pines image: (a) Visualization using 3 channels; (b) Original Ground Truth; (c) Processed Ground Truth (black for unknown; green for vegetation; red for non-vegetation)
  • Figure 5: Reference and Interpolated Pixel for CAVE Balloons
  • ...and 11 more figures