PixelDINO: Semi-Supervised Semantic Segmentation for Detecting Permafrost Disturbances

Konrad Heidler, Ingmar Nitze, Guido Grosse, Xiao Xiang Zhu

TL;DR

PixelDINO introduces a pixel-level semi-supervised semantic segmentation framework that uses self-distillation with pixel-wise pseudo-classes to fuse labeled RTS data with large amounts of unlabeled Sentinel-2 imagery. By combining a supervised loss with a PixelDINO loss and employing weak/strong augmentations, the method improves generalization to unseen Arctic regions compared with supervised baselines and other semi-supervised approaches. The results show notable gains in IoU and F1 scores, reductions in false positives, and robust performance across coastal and inland RTS sites, highlighting the practical value of semi-supervised learning for permafrost disturbance mapping. This approach reduces the labeling burden and is adaptable to other remote sensing tasks, supporting scalable monitoring of permafrost degradation and related landforms.

Abstract

Arctic permafrost is facing significant changes due to global climate change. As these regions are largely inaccessible, remote sensing plays a crucial role in better understanding the underlying processes, not just on a local scale but across the Arctic. In this study, we focus on the remote detection of retrogressive thaw slumps (RTS), a permafrost disturbance comparable to landslides induced by thawing. For such analyses from space, deep learning has become an indispensable tool, but limited labelled training data remains a challenge for training accurate models. To improve model generalization across the Arctic without the need for additional labelled data, we present a semi-supervised learning approach to train semantic segmentation models to detect RTS. Our framework, called PixelDINO, is trained in parallel on labelled and unlabelled data. For the unlabelled data, the model segments the imagery into self-taught pseudo-classes, and the training procedure ensures consistency of these pseudo-classes across strong augmentations of the input data. Our experimental results demonstrate that PixelDINO can improve model performance over both supervised baseline methods and existing semi-supervised semantic segmentation approaches, highlighting its potential for training robust models that generalize well to regions that were not included in the training data. The project page containing code and other materials for this study can be found at https://khdlr.github.io/PixelDINO/.

Paper Structure (25 sections, 4 equations, 3 figures, 2 tables, 1 algorithm)

Figures (3)

  • Figure 1: Spatial distribution of the annotated training sites (red). The labelled data have quite limited spatial coverage. Semi-supervised learning makes it possible to include large areas of unlabelled Sentinel-2 imagery (green) in the training process. Basemap source: brown2002_circumarctic
  • Figure 2: Overview of the self-supervised part of the PixelDINO framework for pixel-wise feature learning. First, the image is weakly augmented and a dense feature map is derived using the teacher model. The teacher's features are then turned into class labels by centering, sharpening, and applying the softmax function. Both the weakly augmented image and the teacher label are augmented using the set of strong augmentations. The student model is then trained on this pair of image and label. Finally, the teacher model's weights are updated as an exponential moving average of the student's weights.
  • Figure 3: Prediction results for parts of the Herschel Island (top) and Lena (bottom) study sites for the Baseline+Aug and PixelDINO training methods. Most prominent is the large reduction in false positives due to the semi-supervised training method.
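
The training step described in the Figure 2 caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-pixel linear "model", the temperatures, the EMA momentum, the choice of flips, rotations and brightness scaling as augmentations, and the use of the mean teacher feature as the center are all hypothetical placeholders chosen to keep the example self-contained.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def teacher_labels(teacher_feats, center, temperature=0.04):
    # Centering, sharpening (low temperature), then softmax per pixel.
    # teacher_feats: (H, W, K); center: (K,). The temperature is an assumption.
    return softmax((teacher_feats - center) / temperature, axis=-1)

def strong_augment(image, label, rng):
    # Hypothetical strong augmentation: the same geometric transform is
    # applied to image and label so that pixels stay aligned; the photometric
    # change (brightness scaling) touches the image only.
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k, axes=(0, 1))
    label = np.rot90(label, k, axes=(0, 1))
    if rng.random() < 0.5:
        image, label = image[:, ::-1], label[:, ::-1]
    return image * rng.uniform(0.8, 1.2), label

def pixel_cross_entropy(student_logits, target_probs, temperature=0.1):
    # Mean per-pixel cross-entropy between soft teacher labels and the
    # student's log-softmax output. The temperature is an assumption.
    z = student_logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return float(-(target_probs * log_p).sum(axis=-1).mean())

def ema_update(teacher_params, student_params, momentum=0.99):
    # Teacher weights follow an exponential moving average of the student's.
    return {name: momentum * teacher_params[name]
                  + (1.0 - momentum) * student_params[name]
            for name in teacher_params}

# Toy walk-through with a per-pixel linear model (4 bands -> 5 pseudo-classes).
rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8, 4))
student = {"w": rng.normal(size=(4, 5))}
teacher = {"w": student["w"].copy()}

weak = image[:, ::-1]                          # weak augmentation: a flip
t_feats = weak @ teacher["w"]                  # dense teacher feature map
labels = teacher_labels(t_feats, center=t_feats.mean(axis=(0, 1)))
strong_img, strong_lbl = strong_augment(weak, labels, rng)
loss = pixel_cross_entropy(strong_img @ student["w"], strong_lbl)
teacher = ema_update(teacher, student)         # after the student's gradient step
```

In an actual training loop the student would take a gradient step on this loss, combined with the supervised loss on labelled tiles, before the teacher update; that step is omitted here because it depends on the deep learning framework in use.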