Efficient Building Roof Type Classification: A Domain-Specific Self-Supervised Approach

Guneet Mutreja, Ksenia Bittner

TL;DR

The paper addresses roof-type classification from aerial imagery under label scarcity by employing a domain-specific self-supervised framework built on EfficientNet backbones augmented with CBAM. It demonstrates that pretraining on the domain-specific AID dataset, combined with contrastive self-supervised methods such as SimCLR, yields high validation accuracy (up to 95.5%) that rivals transformer-based models while using far fewer parameters. The approach generalizes well to unseen test sets (Braunschweig and Roof Graph) and delivers strong computational efficiency compared with ResNet baselines. Overall, the work presents a practical, data-efficient paradigm for remote-sensing roof classification that leverages domain-specific pretraining and attention-enhanced networks to perform well with limited labeled data.

Abstract

Accurate classification of building roof types from aerial imagery is crucial for various remote sensing applications, including urban planning, disaster management, and infrastructure monitoring. However, this task is often hindered by the limited availability of labeled data for supervised learning approaches. To address this challenge, this paper investigates the effectiveness of self-supervised learning with EfficientNet architectures, known for their computational efficiency, for building roof type classification. We propose a novel framework that incorporates a Convolutional Block Attention Module (CBAM) to enhance the feature extraction capabilities of EfficientNet. Furthermore, we explore the benefits of pretraining on a domain-specific dataset, the Aerial Image Dataset (AID), compared to ImageNet pretraining. Our experimental results demonstrate the superiority of our approach. Employing the Simple Framework for Contrastive Learning of Visual Representations (SimCLR) with EfficientNet-B3 and CBAM achieves 95.5% accuracy on our validation set, matching the performance of state-of-the-art transformer-based models while utilizing significantly fewer parameters. We also provide a comprehensive evaluation on two challenging test sets, demonstrating the generalization capability of our method. Notably, our findings highlight the effectiveness of domain-specific pretraining, which consistently leads to higher accuracy than pretraining on the generic ImageNet dataset. Our work establishes EfficientNet-based self-supervised learning as a computationally efficient and highly effective approach for building roof type classification, particularly beneficial in scenarios with limited labeled data.
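
The contrastive objective behind SimCLR (the NT-Xent loss) can be sketched in a few lines. The following is a minimal, dependency-free illustration and not the authors' implementation: the function names `l2_normalize` and `nt_xent_loss` are hypothetical, the temperature of 0.5 is an assumed default rather than a value taken from the paper, and the embeddings stand in for the outputs of a projection head on top of a backbone such as EfficientNet-B3 with CBAM.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss for a batch of paired embeddings.

    view_a[i] and view_b[i] are embeddings of two augmentations of image i.
    temperature=0.5 is an assumed default, not a value from the paper.
    """
    if len(view_a) != len(view_b) or len(view_a) == 0:
        raise ValueError("view_a and view_b must be non-empty and equally long")

    n = len(view_a)
    z = [l2_normalize(v) for v in list(view_a) + list(view_b)]

    def similarity(i, k):
        return sum(a * b for a, b in zip(z[i], z[k])) / temperature

    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same image
        # Similarities to every other embedding in the batch (self excluded).
        logits = [similarity(i, k) for k in range(2 * n) if k != i]
        # Numerically stable log-sum-exp over the denominator terms.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(s - peak) for s in logits))
        total += log_denominator - similarity(i, positive)
    return total / (2 * n)


# Toy batch: two images, two-dimensional embeddings.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

In this toy batch the loss is lower when the two views of each image map to the same direction than when the pairs are swapped, which is the behavior a contrastive pretraining stage relies on before fine-tuning with the limited roof-type labels.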

Paper Structure

This paper contains 13 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Classification results of the BEiT model pretrained on the Aerial Image Dataset for building roof types in the Braunschweig area.
  • Figure 2: Visual representation of representative architecture configurations employed in the study: ResNet50, EfficientNetB3, and EfficientNetB3 enhanced with CBAM. These architectures serve as examples of the diverse backbones explored, ranging from ResNet34 to ResNet50 and EfficientNetB0 to B3 with and without CBAM.
  • Figure 3: Sample images from the dataset illustrating the four roof types: Gable, Hip, Flat, and Complex.
  • Figure 4: Performance visualization: BEiT (Row 1) and (Row 2) on a manually labeled Roof Graph subset, pretrained on AID.