
Analyzing Decades-Long Environmental Changes in Namibia Using Archival Aerial Photography and Deep Learning

Girmaw Abebe Tadesse, Caleb Robinson, Gilles Quentin Hacheme, Akram Zaytar, Rahul Dodhia, Tsering Wangyal Shawa, Juan M. Lavista Ferres, Emmanuel H. Kreike

TL;DR

The paper presents a workflow for detecting Waterholes, Omuti homesteads, and Big trees in historical aerial photographs of Namibia (1943 and 1972) using a U-Net semantic segmentation model trained on sparse annotations. It introduces a class-weighting scheme and a semi-supervised pseudo-labeling approach with an empirical p-value post-processing step to mitigate label sparsity and inter-class imbalance. Results show that these strategies significantly improve $F_1$ scores, with notable cross-temporal environmental insights such as increases in the average size of Waterholes and Big trees and a decrease in the average size of Omuti homesteads, illustrating the value of archival imagery for long-term environmental analysis. The work also demonstrates scalability to larger geographic extents and highlights the potential for uncovering previously unlabeled features, encouraging broader digitization and analysis of archival aerial photos for climate and sustainability research in Africa.

Abstract

This study explores object detection in historical aerial photographs of Namibia to identify long-term environmental changes. Specifically, we aim to identify key objects -- Waterholes, Omuti homesteads, and Big trees -- around Oshikango in Namibia using sub-meter gray-scale aerial imagery from 1943 and 1972. In this work, we propose a workflow for analyzing historical aerial imagery using a deep semantic segmentation model on sparse hand-labels. To this end, we employ a number of strategies, including class-weighting, pseudo-labeling and empirical p-value-based filtering, to balance skewed and sparse representations of objects in the ground truth data. Results demonstrate the benefits of these different training strategies, resulting in an average $F_1=0.661$ and $F_1=0.755$ over the three objects of interest for the 1943 and 1972 imagery, respectively. We also identified that the average size of Waterholes and Big trees increased while the average size of Omuti homesteads decreased between 1943 and 1972, reflecting some of the local effects of the massive post-Second World War economic, agricultural, demographic, and environmental changes. This work also highlights the untapped potential of historical aerial photographs in understanding long-term environmental changes beyond Namibia (and Africa). With the lack of adequate satellite technology in the past, archival aerial photography offers a great alternative to uncover decades-long environmental changes.
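
The class-weighting idea mentioned in the abstract can be illustrated with a minimal sketch. The exact weighting scheme used in the paper is not given here, so the following assumes a common inverse-frequency ("balanced") weighting computed only over annotated pixels. The class ids (0 for unlabeled, 1 to 3 for the three object classes) and the function name `inverse_frequency_weights` are hypothetical.

```python
import numpy as np

UNLABELED = 0  # hypothetical id for pixels that carry no annotation


def inverse_frequency_weights(mask, num_classes, ignore_index=UNLABELED):
    """Return one weight per class id, inversely proportional to its pixel count.

    Assumption: unlabeled pixels are excluded from the counts, so sparse
    annotation does not dominate the statistics. Absent classes get weight 0.
    """
    mask = np.asarray(mask, dtype=np.int64)
    labels = mask[mask != ignore_index]          # keep annotated pixels only
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    weights = np.zeros(num_classes)
    present = counts > 0
    # "Balanced" weighting: total / (number of present classes * class count)
    weights[present] = counts[present].sum() / (present.sum() * counts[present])
    return weights


# Toy mask: 6 pixels of class 1, 3 of class 2, 1 of class 3, rest unlabeled
toy_mask = np.array([[1, 1, 1, 1, 1, 1],
                     [2, 2, 2, 3, 0, 0]])
toy_weights = inverse_frequency_weights(toy_mask, num_classes=4)
```

Weights of this kind are typically passed to a per-pixel loss so that rare classes contribute as much to the gradient as frequent ones; whether the paper uses this particular formula is an assumption.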

Paper Structure (15 sections, 6 figures, 3 tables, 1 algorithm)

Figures (6)

  • Figure 1: Overview of the proposed approach. Our study focuses on identifying objects of interest from decades-long aerial photos (1943-1972) to study long-term environmental changes in: (A) the Oshikango region ($\approx 5000\ \mathrm{km}^2$) in North-Central Namibia; (B) a $45\ \mathrm{km}^2$ area in the Oshikango region was sparsely annotated and used as the train and test region in our framework; (C) representative examples were annotated for the classes: Big Tree, Omuti and Waterhole; (D) a deep learning framework that applies semantic segmentation to the aerial photos and is trained with different strategies; (E) insights are extracted to understand the change between 1943 and 1972.
  • Figure 2: The block diagram of the proposed segmentation framework, where the highlighted blocks constitute the main contributions. Given a sparse set of annotated data for a $\approx 45\ \mathrm{km}^2$ area of the Oshikango region, we first split the data spatially into non-overlapping train and test sets. We employed a deep learning model for the segmentation task, which utilizes a U-Net (Ronneberger et al., 2015) architecture with a pre-trained ResNet (He et al., 2016) architecture as a backbone. The Training step utilizes a Class Weighting strategy due to the sparse nature of the annotation, and Pseudo-labeling to exploit the originally unannotated part of the train set. Inference is performed at pixel and polygon levels. The Evaluation step adopts segmentation metrics to quantify the performance of the model. Evaluation sets include the test set (with ground truth data) and the whole Oshikango region.
  • Figure 3: Visualizations of the areas of predicted polygons from 1943 imagery as (a) a histogram and (b) an empirical p-value compared to the polygons in the training set for each class. It is clear that the histogram distributions are heavily skewed, so a post-processing filter applied to them would be very sensitive to the threshold value. On the other hand, the empirical p-value distributions are more balanced and hence less sensitive to threshold-based post-processing.
  • Figure 4: Examples of annotated aerial photos and the three objects of interest, i.e., Waterholes, Omuti homesteads and Big trees.
  • Figure 5: False positives may be objects that were left unlabeled during annotation: (a) the ground truth data, where two objects were not labeled during annotation (the arrows point to zoomed views of these objects for better visualization); (b) the same two objects, detected as Waterholes at inference time (marked with dots). Note that Big trees and Omuti homesteads are marked with green and blue markers, respectively.
  • ...and 1 more figures
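
The empirical p-value post-processing described in the Figure 3 caption can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it uses a standard two-sided empirical p-value with a +1 correction, computed from polygon areas, and the function names, the `alpha` threshold, and the toy areas are all hypothetical.

```python
import numpy as np


def empirical_p_values(pred_areas, train_areas):
    """Two-sided empirical p-value of each predicted area w.r.t. training areas.

    Assumption: a predicted polygon is implausible when its area is more
    extreme (smaller or larger) than almost all training polygons of its class.
    """
    train = np.sort(np.asarray(train_areas, dtype=float))
    pred = np.asarray(pred_areas, dtype=float)
    n = train.size
    # Number of training areas >= each predicted area
    n_ge = n - np.searchsorted(train, pred, side="left")
    # Number of training areas <= each predicted area
    n_le = np.searchsorted(train, pred, side="right")
    p_right = (1 + n_ge) / (1 + n)
    p_left = (1 + n_le) / (1 + n)
    return np.minimum(1.0, 2 * np.minimum(p_right, p_left))


def filter_by_p_value(pred_areas, train_areas, alpha=0.05):
    """Boolean mask: True for polygons whose area is not an outlier."""
    return empirical_p_values(pred_areas, train_areas) >= alpha


# Toy data: training areas 1..99, three predicted polygons
train_areas = np.arange(1, 100)
pred_areas = [50, 0.01, 1000]
p_vals = empirical_p_values(pred_areas, train_areas)
keep = filter_by_p_value(pred_areas, train_areas, alpha=0.05)
```

Because the p-value is a rank-based quantity bounded in (0, 1], a single threshold behaves similarly across classes with very different area scales, which is consistent with the caption's point that thresholding raw, heavily skewed areas is more sensitive to the chosen value.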