AquaCluster: Using Satellite Images And Self-supervised Machine Learning Networks To Detect Water Hidden Under Vegetation

Ioannis Iakovidis, Zahra Kalantari, Amir Hossein Payberah, Fernando Jaramillo, Francisco Pena Escobar

TL;DR

The paper addresses the detection of surface water hidden under vegetation in wetlands, using radar imagery and no labeled data. It introduces AquaCluster, a self-supervised framework that combines deep clustering and negative sampling on radar data, built from a modified U-Net encoder and a lightweight classifier, with an ensemble of models for robustness. A single AquaCluster model surpasses the baselines (e.g., IoU up to $0.85$), and the ensemble reaches an IoU of $0.89$, with higher recall and precision across Swedish wetlands. Because retraining requires no annotations, the approach is easier to adapt to different climates or sensor configurations for scalable wetland monitoring.

Abstract

In recent years, the wide availability of high-resolution radar satellite images has enabled the remote monitoring of wetland surface areas. Machine learning models have achieved state-of-the-art results in segmenting wetlands from satellite images. However, these models require large amounts of manually annotated satellite images, which are slow and expensive to produce. The need for annotated training data makes it difficult to adapt these models to changes such as different climates or sensors. To address this issue, we employed self-supervised training methods to develop a model, AquaCluster, which segments radar satellite images into water and land areas without manual annotations. On our test dataset, our final model outperformed other radar-based water detection techniques that do not require annotated data, achieving a 0.08 improvement in the Intersection over Union metric. Our results demonstrate that it is possible to train machine learning models to detect vegetated water from radar images without annotated data, which can make retraining these models to account for such changes much easier.
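
The Intersection over Union (IoU) metric named in the abstract can be illustrated with a minimal sketch. It assumes binary masks flattened into sequences where 1 means water and 0 means land; the function name `iou`, the toy masks, and the convention for two empty masks are illustrative assumptions, not taken from the paper.

```python
def iou(pred, truth):
    """Intersection over Union for the water class of two binary masks.

    Both masks are flat sequences of 0 (land) and 1 (water) of equal length.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as water in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    # Pixels marked as water in at least one mask.
    union = sum(1 for p, t in zip(pred, truth) if p == 1 or t == 1)
    # Assumed convention: two masks with no water at all agree perfectly.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 8-pixel example: 3 shared water pixels, 5 water pixels overall.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
pred = [1, 1, 0, 0, 0, 1, 1, 0]
score = iou(pred, truth)  # 3 / 5 = 0.6
```

Under this definition, a 0.08 improvement means the ratio of correctly shared water pixels to all water pixels in either mask rises by 0.08 in absolute terms.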

Paper Structure

This paper contains 34 sections, 6 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Comparison of optical and radar satellite image methods for water detection, applied to a wetland partially covered by vegetation. The method based on optical data cannot detect the water hidden under vegetation. The radar image, on the other hand, contains a lot of noise, which necessitates a more complicated classification method. Modified from pena2023deepaqua.
  • Figure 2: The training algorithm. Three groups of tiles are sampled from the training image (image tiles, augmented image tiles and augmented image shuffled tiles). Each of these groups of tiles is processed by an encoding sub-model and the prediction sub-model to produce tiles of pixel-level class probabilities. Using the class probability tiles and two different loss algorithms (per-pixel cross-entropy and per-pixel absolute difference), the four losses used to train the sub-models (two deep clustering losses, a positive spatial consistency loss and a negative spatial consistency loss) are computed. Sub-models with the same color share parameters.
  • Figure 3: The standard path of the training algorithm. Each image tile is processed by the encoding sub-model, which outputs tiles containing an $N_{enc}$-sized encoding vector for each pixel. These tiles are then processed by the prediction sub-model, which outputs tiles with class probabilities for each pixel. For each pixel the class with the highest probability is treated as its label, which is used in the deep clustering loss. The class probabilities of each pixel are also compared with the class probabilities of pixels from the other two paths of the training algorithm to compute the contrastive losses.
  • Figure 4: Examples of segmentation when using models with two and four model classes. The first row shows the input radar images, the second row shows the ground truth water/land segmentations, the third row shows the model segmentations and the fourth row shows the land/water segmentations produced by post-processing the model segmentations. As can be seen in the left sub-figure, while the model using two model classes produces a logical segmentation of the input image into two classes, these two classes do not correspond to the water and land ground truth classes. Since both of the model classes end up having a greater overlap with the ground truth land class in the test dataset, they are both assigned to the land class during post-processing. As can be seen in the right sub-figure, the model using four model classes produces segmentations that better separate land from water, although even for this model large parts of land are grouped with water areas.
  • Figure 5: Test results of the AquaCluster model over ten different experiments, along with the mean test result of those experiments and the test result of the ensemble of those models. The performance of the single AquaCluster models varies widely between experiments. The ensemble of those models achieves significantly better performance than the mean of the individual models.
  • ...and 1 more figure
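
Two steps described in the captions of Figures 3 and 4 can be sketched in a few lines: the pseudo-labelling behind the deep clustering loss (each pixel's most probable class is treated as its label), and the post-processing that assigns each model class to water or land by overlap with the ground truth. This is a simplified sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, tiles are flattened to plain lists, overlap is measured by raw pixel counts, and ties go to land.

```python
import math


def deep_clustering_loss(prob_tile):
    """Mean per-pixel cross-entropy against argmax pseudo-labels.

    prob_tile is a list with one class-probability list per pixel.
    """
    total = 0.0
    for probs in prob_tile:
        # The class with the highest probability becomes the pixel's label.
        label = max(range(len(probs)), key=probs.__getitem__)
        total += -math.log(probs[label])
    return total / len(prob_tile)


def map_classes_to_land_water(model_labels, truth_labels, n_classes):
    """Assign each model class to land (0) or water (1) by pixel overlap.

    Assumption: overlap is a raw pixel count, and ties are assigned to land.
    """
    water = [0] * n_classes
    land = [0] * n_classes
    for m, t in zip(model_labels, truth_labels):
        if t == 1:
            water[m] += 1
        else:
            land[m] += 1
    return [1 if water[c] > land[c] else 0 for c in range(n_classes)]


# Pseudo-label loss on a hypothetical two-pixel, two-class tile.
loss = deep_clustering_loss([[0.8, 0.2], [0.25, 0.75]])

# Two model classes that each overlap land more than water: both map to land.
two_class_map = map_classes_to_land_water(
    [0, 0, 0, 1, 1, 1], [1, 0, 0, 0, 0, 1], n_classes=2)

# Four model classes: class 0 covers the water pixels, so it maps to water.
model = [0, 0, 1, 1, 2, 2, 3, 3, 3]
truth = [1, 1, 0, 0, 0, 1, 0, 0, 0]
four_class_map = map_classes_to_land_water(model, truth, n_classes=4)
segmentation = [four_class_map[m] for m in model]
```

The two-class case reproduces the failure mode described for Figure 4: when neither model class overlaps water more than land, post-processing labels the whole image as land. With four model classes, at least one class can align with water.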