Large-scale unsupervised spatio-temporal semantic analysis of vast regions from satellite images sequences

Carlos Echegoyen, Aritz Pérez, Guzmán Santafé, Unai Pérez-Goya, María Dolores Ugarte

TL;DR

The authors address the challenge of extracting large-scale, semantically meaningful land partitions from unlabeled satellite image time series. They introduce a fully unsupervised pipeline that first learns a geographic semantic embedding via Tile2Vec, then represents each image tile over time as a multidimensional time series, and finally clusters these series with K-means. A subsequent refinement of the embedding, guided by the clustering structure, further sharpens the partitions and yields interpretable, region-wide semantic maps. The method, demonstrated on Sentinel-2 data over a 220 km$^2$ area in northern Spain, produces a compact, well-structured segmentation driven mainly by climatic, phytological, and hydrological factors, and offers a scalable, label-free approach for climate zone mapping, river basin analysis, and ecological monitoring.

Abstract

Temporal sequences of satellite images constitute a highly valuable and abundant resource for analyzing regions of interest. However, the automatic acquisition of knowledge on a large scale is a challenging task due to different factors such as the lack of precise labeled data, the definition and variability of the terrain entities, or the inherent complexity of the images and their fusion. In this context, we present a fully unsupervised and general methodology to conduct spatio-temporal taxonomies of large regions from sequences of satellite images. Our approach relies on a combination of deep embeddings and time series clustering to capture the semantic properties of the ground and its evolution over time, providing a comprehensive understanding of the region of interest. The proposed method is enhanced by a novel procedure specifically devised to refine the embedding and exploit the underlying spatio-temporal patterns. We use this methodology to conduct an in-depth analysis of a 220 km$^2$ region in northern Spain in different settings. The results provide a broad and intuitive perspective of the land where large areas are connected in a compact and well-structured manner, mainly based on climatic, phytological, and hydrological factors.
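
To make the clustering step concrete, the following is a minimal sketch of how per-tile embedding sequences could be clustered with K-means. It is an illustration under stated assumptions, not the authors' implementation: random vectors stand in for Tile2Vec embeddings, each tile's series is simply flattened and compared with Euclidean distance, and the helper names (`tiles_to_series`, `init_centers`, `kmeans`) and the farthest-point initialisation are hypothetical choices made for this example.

```python
import numpy as np


def tiles_to_series(embeddings):
    """Turn a (T, N, D) array (T dates, N tiles, D embedding dims)
    into an (N, T*D) matrix: one flattened time series per tile."""
    T, N, D = embeddings.shape
    return embeddings.transpose(1, 0, 2).reshape(N, T * D)


def init_centers(X, k):
    """Deterministic farthest-point initialisation (an assumption of this
    sketch; the source does not say how K-means is initialised)."""
    centers = [X[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen centre.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers)


def assign(X, centers):
    """Index of the nearest centre for every row of X."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(X, k, n_iter=50):
    """Plain Lloyd iterations with Euclidean distance."""
    centers = init_centers(X, k)
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers


# Synthetic stand-in for tile embeddings (NOT real Tile2Vec output):
# the first half of the tiles follows one seasonal profile, the second
# half follows the opposite profile, plus small noise.
rng = np.random.default_rng(0)
T, N, D = 6, 40, 4
season = np.sin(2 * np.pi * np.arange(T) / T)[:, None, None]        # (T, 1, 1)
sign = np.where(np.arange(N) < N // 2, 1.0, -1.0)[None, :, None]    # (1, N, 1)
emb = season * sign + 0.1 * rng.normal(size=(T, N, D))

X = tiles_to_series(emb)
labels, centers = kmeans(X, k=2)
```

In this toy setting `labels` assigns one cluster to tiles sharing the first temporal profile and another to the rest. Mapping each label back to its tile's position on the ground would give a partition map of the region.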

Paper Structure (19 sections, 3 equations, 9 figures, 2 tables)

Figures (9)

  • Figure 1: Region chosen for experiments in northern Spain, covered by 4 Sentinel-2 images.
  • Figure 2: Geographical representations and 2D projections of the clustering $\mathcal{P}^g$ with different numbers of clusters.
  • Figure 3: Comparison between pixel-based and embedding-based clusterings for the NE region. (a) Representative satellite image of the sequence. (b) Pixel-based clustering with tiles of size $8\times8$ pixels. (c) Pixel-based clustering with tiles of size $100\times100$ pixels. (d) Embedding-based clustering $\mathcal{P}^g$.
  • Figure 4: Comparison between pixel-based and embedding-based clusterings. The first column shows representative satellite images of each sequence, the second column shows the pixel-based clusterings with tiles of size $100\times100$ pixels, and the last column shows the embedding-based clusterings $\mathcal{P}^g$.
  • Figure 5: Overall clustering and legend with associated semantic tags when considering the four sequences of images.
  • ...and 4 more figures