Remote sensing framework for geological mapping via stacked autoencoders and clustering

Sandeep Nagar, Ehsan Farahbakhsh, Joseph Awange, Rohitash Chandra

TL;DR

This study introduces an unsupervised remote sensing framework that combines stacked autoencoders for nonlinear dimensionality reduction with $k$-means clustering to produce geological maps from multispectral imagery. By evaluating Landsat 8, ASTER, and particularly Sentinel-2 data over the Mutawintji region, the authors demonstrate that stacked autoencoders yield superior latent representations, enabling finer discrimination of rock units than PCA or canonical autoencoders. The framework includes elbow-based determination of cluster count and a majority-filter post-processing step, with results validated against limited ground truth; Sentinel-2 data with stacked autoencoders achieves the highest accuracy and most detailed maps. Overall, the approach provides a scalable, unsupervised tool for regional geological mapping and can be extended to other datasets and clustering techniques, with open-source code available for replication.
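The majority-filter post-processing step mentioned above can be illustrated with a short sketch. This is not the authors' implementation: the window size of 3, the edge padding, and the function name `majority_filter` are assumptions made for illustration. The idea is that each pixel in the clustered map takes the most frequent cluster label in its neighbourhood, which removes isolated "salt-and-pepper" pixels.

```python
import numpy as np

def majority_filter(labels: np.ndarray, size: int = 3) -> np.ndarray:
    """Replace each label with the most frequent label in its size x size window.

    `labels` is a 2-D array of non-negative integer cluster labels.
    The window size and edge padding are illustrative assumptions.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    pad = size // 2
    padded = np.pad(labels, pad, mode="edge")
    n_classes = int(labels.max()) + 1
    rows, cols = labels.shape
    # votes[c, i, j] counts the window pixels around (i, j) that carry label c.
    votes = np.zeros((n_classes, rows, cols), dtype=np.int32)
    for dr in range(size):
        for dc in range(size):
            window = padded[dr:dr + rows, dc:dc + cols]
            for c in range(n_classes):
                votes[c] += (window == c)
    # argmax breaks ties in favour of the smallest label.
    return votes.argmax(axis=0)

# Toy clustered map: a uniform unit with one isolated, misclassified pixel.
noisy_map = np.zeros((5, 5), dtype=int)
noisy_map[2, 2] = 1
filtered = majority_filter(noisy_map)
```

In this toy map the isolated pixel is outvoted by its eight neighbours, so the filtered map is uniform. Larger windows smooth more aggressively, at the cost of erasing thin geological features.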

Abstract

Supervised machine learning methods for geological mapping via remote sensing are limited by the scarcity of accurately labelled training data. Unsupervised learning methods, such as dimensionality reduction and clustering, can address this limitation. Dimensionality reduction methods have the potential to play a crucial role in improving the accuracy of geological maps. Although conventional dimensionality reduction methods may struggle with nonlinear data, unsupervised deep learning models such as autoencoders can model nonlinear relationships. Stacked autoencoders feature multiple interconnected layers that capture hierarchical data representations useful for remote sensing data. We present an unsupervised machine learning framework for processing remote sensing data that uses stacked autoencoders for dimensionality reduction and $k$-means clustering for mapping geological units. We use Landsat 8, ASTER, and Sentinel-2 datasets to evaluate the framework for geological mapping of the Mutawintji region in Western New South Wales, Australia. We also compare stacked autoencoders with principal component analysis (PCA) and canonical autoencoders. Our results show that the framework produces accurate and interpretable geological maps and discriminates rock units efficiently. The combination of stacked autoencoders with Sentinel-2 data yields the highest accuracy among the combinations tested. We find that stacked autoencoders extract complex and hierarchical representations of the input data better than canonical autoencoders and PCA. We also find that the generated maps align with prior geological knowledge of the study area while providing novel insights into geological structures.
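The reduce-then-cluster idea described in the abstract can be sketched end to end on synthetic data. This is a minimal illustration and not the authors' code: it uses PCA (the baseline the paper compares against) in place of the stacked autoencoder, a plain NumPy implementation of Lloyd's $k$-means, and a made-up 30 x 30 image with 6 bands and 3 hypothetical rock units. The inertia values printed for each $k$ are the quantity usually inspected for an elbow when choosing the cluster count.

```python
import numpy as np

def pca_reduce(pixels, n_components):
    """Project (n_pixels, n_bands) data onto its leading principal components."""
    centered = pixels - pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def squared_distances(features, centroids):
    """Squared Euclidean distance from every sample to every centroid."""
    diff = features[:, None, :] - centroids[None, :, :]
    return (diff ** 2).sum(axis=2)

def kmeans(features, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, inertia)."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        labels = squared_distances(features, centroids).argmin(axis=1)
        # Keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    dists = squared_distances(features, centroids)
    return dists.argmin(axis=1), dists.min(axis=1).sum()

# Synthetic stand-in for a multispectral image (all values are hypothetical).
rng = np.random.default_rng(42)
height, width, n_bands = 30, 30, 6
unit_spectra = rng.uniform(0.1, 0.9, size=(3, n_bands))
true_units = rng.integers(0, 3, size=(height, width))
image = unit_spectra[true_units] + rng.normal(0, 0.02, size=(height, width, n_bands))

# Step 1: flatten to (pixels, bands) and reduce dimensionality.
pixels = image.reshape(-1, n_bands)
reduced = pca_reduce(pixels, n_components=2)

# Step 2: compute inertia for a range of k to look for an elbow.
k_values = range(1, 7)
inertias = [kmeans(reduced, k)[1] for k in k_values]
for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.3f}")

# Step 3: cluster with the chosen k and reshape the labels into a map.
labels, _ = kmeans(reduced, k=3)
cluster_map = labels.reshape(height, width)
```

Swapping `pca_reduce` for an autoencoder's encoder leaves the rest of the pipeline unchanged, which is what makes the comparison between dimensionality reduction methods straightforward. Because Lloyd's algorithm depends on its initialisation, in practice one would typically run several seeds per $k$ and keep the lowest inertia.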

Paper Structure (12 sections, 8 figures, 3 tables)

This paper contains 12 sections, 8 figures, 3 tables.

Figures (8)

  • Figure 1: a) The Curnamona Province and other Proterozoic terrains in Australia [barovich2008tectonic]; the study area is marked by a black square. b) Simplified geological map of the study area.
  • Figure 2: False colour composite images generated using a) Landsat 8 (RGB 753), b) ASTER (RGB 321), and c) Sentinel-2 (RGB 843) data.
  • Figure 3: a) The architecture of a canonical autoencoder consists of an encoder and a decoder. The encoder takes the multispectral image as input ($x$) and reduces the dimension to the latent vector ($z$), where dim($x$) $\geq$ dim($z$). The decoder reconstructs the image from the latent vector ($z$). b) A stacked autoencoder with three encoders and decoders for each stacking level. Each stack level's encoder and decoder architecture is the same as the canonical autoencoder, with a number of hidden layers for each.
  • Figure 4: Visualization of dimensionality reduction for a multispectral dataset. The number of input spectral bands $n$ is reduced to $m$ components in the output dataset. Each coloured layer represents a spectral band in the input image and a component in the output.
  • Figure 5: Machine learning framework for creating geological maps by integrating dimensionality reduction methods (PCA, canonical autoencoder, and stacked autoencoder) with $k$-means clustering.
  • ...and 3 more figures
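
The stacking idea in the Figure 3 caption can be sketched with greedy layer-wise training, where each level is a small autoencoder trained on the codes produced by the level before it. This is only one common way to build a stacked autoencoder and may differ from the authors' training scheme. The layer sizes (8, 5, 3), the tanh encoder with a linear decoder, the learning rate, and the synthetic 10-band input are all assumptions chosen for illustration.

```python
import numpy as np

def train_autoencoder(x, latent_dim, epochs=300, lr=0.01, seed=0):
    """Train a one-hidden-layer autoencoder by full-batch gradient descent.

    Encoder: z = tanh(x W_enc + b_enc). Decoder: x_hat = z W_dec + b_dec.
    Loss: mean over samples of 0.5 * ||x_hat - x||^2. Returns the encoder.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w_enc = rng.normal(0.0, 0.1, size=(d, latent_dim))
    b_enc = np.zeros(latent_dim)
    w_dec = rng.normal(0.0, 0.1, size=(latent_dim, d))
    b_dec = np.zeros(d)
    for _ in range(epochs):
        z = np.tanh(x @ w_enc + b_enc)
        x_hat = z @ w_dec + b_dec
        err = (x_hat - x) / n                  # dLoss/dx_hat
        dz = (err @ w_dec.T) * (1.0 - z ** 2)  # backprop through tanh
        w_dec -= lr * (z.T @ err)
        b_dec -= lr * err.sum(axis=0)
        w_enc -= lr * (x.T @ dz)
        b_enc -= lr * dz.sum(axis=0)

    def encode(a):
        return np.tanh(a @ w_enc + b_enc)

    return encode

def train_stacked_autoencoder(x, layer_dims):
    """Greedy layer-wise stacking: each level is trained on the previous codes."""
    encoders = []
    codes = x
    for level, dim in enumerate(layer_dims):
        encoder = train_autoencoder(codes, dim, seed=level)
        encoders.append(encoder)
        codes = encoder(codes)

    def encode(a):
        for encoder in encoders:
            a = encoder(a)
        return a

    return encode

# Synthetic stand-in for n = 10 spectral bands per pixel (hypothetical data).
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
pixels = np.tanh(factors @ mixing) + rng.normal(0.0, 0.05, size=(500, 10))
pixels = (pixels - pixels.mean(axis=0)) / pixels.std(axis=0)

# Three stacking levels reduce 10 bands to m = 3 latent components.
stacked_encode = train_stacked_autoencoder(pixels, layer_dims=(8, 5, 3))
latent = stacked_encode(pixels)
```

The resulting `latent` array has one row per pixel and $m = 3$ columns, and can be passed to $k$-means in place of the original bands. A deep learning library would normally be used for real imagery. The NumPy version here only makes the forward and backward passes explicit.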