Satellite Image Time Series Semantic Change Detection: Novel Architecture and Analysis of Domain Shift

Elliot Vincent, Jean Ponce, Mathieu Aubry

TL;DR

This work tackles semantic change detection over satellite image time series (SITS-SCD) by introducing a multi-temporal architecture that leverages long-term temporal information through a novel temporal attention mechanism, outputting a segmentation map for each timestamp. The method outperforms mono- and bi-temporal baselines on DynamicEarthNet and MUDS while scaling more efficiently with model size, though gains do not always translate to change-detection accuracy. A systematic analysis of temporal and spatial domain shifts reveals that spatial shifts cause the largest performance drops, and temporal shifts mainly degrade change-detection performance while having a smaller effect on semantic segmentation. The study highlights data scarcity and the rarity of meaningful changes as key practical limits, underscoring the need for robust domain adaptation and more diverse annotated SITS data for reliable global monitoring. Overall, the paper advances SITS-SCD by combining long-range temporal reasoning with a principled assessment of domain shift impacts, informing future research and practical deployment in Earth observation workflows.

Abstract

Satellite imagery plays a crucial role in monitoring changes happening on Earth's surface and aiding in climate analysis, ecosystem assessment, and disaster response. In this paper, we tackle semantic change detection with satellite image time series (SITS-SCD), which encompasses both change detection and semantic segmentation tasks. We propose a new architecture that improves over the state of the art, scales better with the number of parameters, and leverages long-term temporal information. However, for practical use cases, models need to adapt to spatial and temporal shifts, which remains a challenge. We investigate the impact of temporal and spatial shifts separately on global, multi-year SITS datasets using DynamicEarthNet and MUDS. We show that the spatial domain shift represents the most complex setting and that the impact of temporal shift on performance is more pronounced on change detection than on semantic segmentation, highlighting that it is a specific issue deserving further attention.
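
To make the link between the two tasks concrete, the sketch below derives binary change maps from per-timestamp segmentation maps by flagging pixels whose class differs between consecutive timestamps. It is an illustration under stated assumptions, not the paper's evaluation code: the function name `change_maps`, the nested-list representation, and the toy class indices are all hypothetical.

```python
from typing import List

LabelMap = List[List[int]]  # one class index per pixel, shape H x W


def change_maps(segmentations: List[LabelMap]) -> List[List[List[bool]]]:
    """Return T-1 binary maps; True where the class differs between t and t+1."""
    changes = []
    for prev, curr in zip(segmentations, segmentations[1:]):
        changes.append([
            [a != b for a, b in zip(row_prev, row_curr)]
            for row_prev, row_curr in zip(prev, curr)
        ])
    return changes


# Toy series: T = 3 timestamps on a 2 x 2 grid (hypothetical class indices).
series = [
    [[0, 0], [1, 1]],
    [[0, 2], [1, 1]],
    [[0, 2], [1, 3]],
]
maps = change_maps(series)

# Fraction of changed pixels for each consecutive pair of timestamps.
n_pixels = len(series[0]) * len(series[0][0])
changed_fraction = [sum(map(sum, m)) / n_pixels for m in maps]
```

Even in this toy series only one pixel in four changes per step, which hints at why a model can segment each timestamp well and still be penalized heavily on the change-detection side.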

Paper Structure (43 sections, 4 equations, 10 figures, 7 tables)

Figures (10)

  • Figure 1: Overall architecture. Given an input SITS, we compute feature maps at various scales. Our contribution is the temporal attention mechanism, which accounts for long-term temporal information. The decoder branch up-scales the feature maps for all time stamps in parallel, while propagating the attention maps at all levels.
  • Figure 2: Attention mechanism of TAE, LTAE and our method. We show the temporal attention mechanism of TAE [Garnot_2020_CVPR], LTAE [Garnot_2020_AALTD] and our method for a given patch $\mathbf{z}^L_{i,j}$ of the feature map $\mathbf{z}^L$. Here, $d$ is the dimension of the key and query vectors.
  • Figure 3: Domain shift settings. We organize the dataset splits in three different manners such that there is respectively (a) no domain shift, (b) a temporal domain shift, and (c) a spatial domain shift between train and val/test sets. DynamicEarthNet [Toker_2022_CVPR] images are shown here for visualization, and we use the same settings for MUDS [van2021multi].
  • Figure 4: Impact of $D$ on performance. We compare the impact of the spatial feature size $D$ on performance (a) for our model and UTAE-based methods in the setting without domain shift and (b) for our model in the three domain shift settings. In each case, we report the SCS score (left) and the mean IoU (right).
  • Figure 5: Qualitative change detection results. We show the binary change detection maps predicted by our model and competing methods in the setting without domain shift for randomly selected input images. From top to bottom, we show the input pairs at time T1 (01/09/2018) and T2 (01/09/2019), the ground-truth binary change map, and the predictions of different methods for DynamicEarthNet (i-vi) and MUDS (vii-xi). For TSViT and UTAE, we use the weekly setting for DynamicEarthNet and the monthly setting for MUDS. Best viewed in color.
  • ...and 5 more figures
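
The caption of Figure 2 refers to key and query vectors of dimension $d$ computed for a single patch of the feature map. As a rough illustration of what attention along the time axis involves, the sketch below applies generic scaled dot-product self-attention to the sequence of feature vectors observed at one patch. This is the textbook formulation, not the mechanism proposed in the paper (whose details are not given in this excerpt); the names `temporal_attention`, `w_q`, `w_k` and the random toy inputs are assumptions made for the example.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply a (d x C) matrix by a length-C vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(seq: List[Vector], w_q: Matrix, w_k: Matrix) -> Tuple[List[Vector], Matrix]:
    """Self-attention over T time steps for one patch.

    seq holds T feature vectors of size C; w_q and w_k project them to
    queries and keys of dimension d. Returns one attended feature vector
    per time step and the T x T attention weights.
    """
    d = len(w_q)
    queries = [matvec(w_q, x) for x in seq]
    keys = [matvec(w_k, x) for x in seq]
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        attn = softmax(scores)
        weights.append(attn)
        # Weighted average of the input features across all time steps.
        outputs.append([sum(a * x[c] for a, x in zip(attn, seq)) for c in range(len(seq[0]))])
    return outputs, weights


# Toy input: T = 4 time steps, C = 3 channels, d = 2 (all values hypothetical).
rng = random.Random(0)
T, C, d = 4, 3, 2
seq = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(T)]
w_q = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(d)]
w_k = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(d)]
outputs, weights = temporal_attention(seq, w_q, w_k)
```

Each row of `weights` sums to one and says how much every other date contributes to the feature at a given time step. Keeping one output per time step mirrors the idea of predicting a segmentation map for each timestamp.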