ADA-Net: Attention-Guided Domain Adaptation Network with Contrastive Learning for Standing Dead Tree Segmentation Using Aerial Imagery

Mete Ahishali, Anis Ur Rahman, Einari Heinaro, Samuli Junttila

TL;DR

The paper tackles the lack of annotated standing dead tree data across large regions by leveraging domain adaptation to transfer knowledge from annotated Finland imagery to unlabeled USA imagery. It introduces ADA-Net, an attention-guided domain adaptation network that combines adversarial image-to-image translation with spatial and frequency contrastive losses and identity regularization, enabling zero-shot cross-site segmentation on domain-shifted aerial imagery. The authors demonstrate state-of-the-art domain adaptation performance and substantial improvements in cross-site dead tree segmentation, using Finland as the target domain and the USA as the source, with Tile-StyleGAN2 yielding the best results among discriminator configurations. Public release of the annotated USA dataset and the software implementation supports further research in forest health monitoring and domain-adaptive segmentation from aerial imagery.

Abstract

Information on standing dead trees is important for understanding forest ecosystem functioning and resilience but has been lacking over large geographic regions. Climate change has caused large-scale tree mortality events that can remain undetected due to limited data. In this study, we propose a novel method for segmenting standing dead trees using aerial multispectral orthoimages. Because access to annotated datasets has been a significant problem in forest remote sensing due to the need for forest expertise, we introduce a method for domain transfer by leveraging domain adaptation to learn a transformation from a source domain X to a target domain Y. In this image-to-image translation task, we aim to utilize available annotations in the target domain by pre-training a segmentation network. When images from a new study site without annotations are introduced (source domain X), these images are transformed into the target domain. Then, transfer learning is applied by running inference with the pre-trained network on the domain-adapted images. In addition to investigating the feasibility of current domain adaptation approaches for this objective, we propose a novel approach called the Attention-guided Domain Adaptation Network (ADA-Net) with enhanced contrastive learning. The ADA-Net approach achieves new state-of-the-art domain adaptation performance, outperforming existing approaches. We have evaluated the proposed approach using two datasets from Finland and the USA. The USA images are converted to the Finland domain, and we show that the synthetic USA2Finland dataset exhibits similar characteristics to the Finland domain images. The software implementation is shared at https://github.com/meteahishali/ADA-Net. The data is publicly available at https://www.kaggle.com/datasets/meteahishali/aerial-imagery-for-standing-dead-tree-segmentation.
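The two-stage inference flow described in the abstract (translate source-domain images into the target domain, then apply the segmentation network pre-trained on the target domain) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `segment_new_site`, the nested-list image type, and the toy translator and segmenter are all hypothetical stand-ins for the ADA-Net generator and the pre-trained segmentation network.

```python
from typing import Callable, List, Sequence

# Hypothetical image and mask types: a single-band image as nested lists.
Image = List[List[float]]
Mask = List[List[int]]


def segment_new_site(
    source_images: Sequence[Image],
    translate_to_target: Callable[[Image], Image],
    segment_target: Callable[[Image], Mask],
) -> List[Mask]:
    """Translate each source-domain image to the target domain, then segment it."""
    masks = []
    for image in source_images:
        # Domain adaptation step: source domain X -> target domain Y.
        adapted = translate_to_target(image)
        # The segmentation network is used as-is; no source annotations are needed.
        masks.append(segment_target(adapted))
    return masks


# Toy stand-ins (assumptions) so the sketch runs end to end.
def toy_translate(image: Image) -> Image:
    return [[0.5 * value for value in row] for row in image]


def toy_segment(image: Image) -> Mask:
    return [[1 if value > 0.4 else 0 for value in row] for row in image]


usa_tile = [[0.2, 1.0], [0.9, 0.1]]
masks = segment_new_site([usa_tile], toy_translate, toy_segment)
```

Passing the translator and the segmenter as separate callables mirrors the separation in the abstract: the segmentation network is trained once on the annotated target domain, and only the translation step changes when a new study site is introduced.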

Paper Structure

This paper contains 16 sections, 25 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: The architecture of the generator network $G$ of the proposed ADA-Net. There are five pixel-sampling operations along the channel dimension of selected layers. These samples are stacked together and given as input to five different MLPs, $\Phi = \left\{ \phi_m \right\}^5_{m=1}$, for the pixel-wise contrastive learning used in Eq. \ref{eq:cont_loss_pixel}. Frequency-domain patches are fed into $\theta$ for patch-wise contrastive learning to compute $\mathcal{L}_\text{freq}$ in Eq. \ref{eq:cont_loss_patch}. The input image is from the source domain (USA), while the transformed image has characteristics similar to the Finland domain images.
  • Figure 2: Details of the ResNet block used in the generator network $G$.
  • Figure 3: The Residual Self-Attention block with convolutional projection operations used in the generator network $G$, where $\alpha$ is the learnable parameter in Eq. \ref{eq:attention_final}.
  • Figure 4: Sample images from the USA and Finland datasets, shown in RGB.
  • Figure 5: Standing dead trees predicted by the Flair U-Net model on USA2Finland images after domain adaptation with the proposed ADA-Net approach and the two best competitors, CUT and Cycle-GAN. The first column shows results obtained without domain adaptation, and the last column shows the ground truth data (GTD) used for performance evaluation.
  • ...and 1 more figures
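
The Figure 3 caption mentions a Residual Self-Attention block with convolutional projections and a learnable parameter $\alpha$. The exact form of the paper's attention equation is not reproduced on this page, so the NumPy sketch below assumes a common formulation: 1x1 convolutional projections (written as matrix products over channels), dot-product attention over spatial positions, and an output of the form `x + alpha * attention(x)`. The function names, weight shapes, and this formulation are assumptions for illustration only.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def residual_self_attention(x, w_q, w_k, w_v, alpha):
    """Assumed residual self-attention over a feature map.

    x:        (C, H, W) feature map
    w_q, w_k: (C_qk, C) weights of 1x1 convolutions (query and key projections)
    w_v:      (C, C) weights of a 1x1 convolution (value projection)
    alpha:    scalar that scales the attention branch (learnable in training)
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)            # (C, N), N spatial positions
    q = w_q @ flat                        # (C_qk, N)
    k = w_k @ flat                        # (C_qk, N)
    v = w_v @ flat                        # (C, N)
    attn = softmax(q.T @ k, axis=-1)      # (N, N); row i weights all positions j
    attended = v @ attn.T                 # (C, N); column i mixes values by attn[i]
    return x + alpha * attended.reshape(c, h, w)  # residual connection


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3, 3))
w_q = rng.normal(size=(2, 4))
w_k = rng.normal(size=(2, 4))
w_v = rng.normal(size=(4, 4))

out_identity = residual_self_attention(x, w_q, w_k, w_v, alpha=0.0)
out_attended = residual_self_attention(x, w_q, w_k, w_v, alpha=0.5)
```

Under this assumed form, setting `alpha` to zero makes the block an identity mapping, so a network initialized that way starts from the plain convolutional features and can learn how much attention to mix in.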