Land Cover Image Classification
Antonio Rangel, Juan Terven, Diana M. Cordova-Esparza, E. A. Chavez-Urbiola
TL;DR
This study benchmarks land-cover (LC) image classification on the EuroSAT dataset, comparing seven CNNs and three transformer architectures under two training regimes: training from scratch and ImageNet pretraining. Models are trained with cross-entropy loss and the Adam optimizer, then evaluated with Top-1 accuracy and precision/recall on a fixed train/validation/test split. With pretraining, transformers, especially MaxViT and Swin, achieve near state-of-the-art results (up to ~0.990 Top-1), while CNNs perform robustly and also improve significantly with pretraining. The findings highlight the strong potential of transformer-based methods for remote-sensing LC tasks and the importance of transfer learning for efficient, scalable land-cover mapping.
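To make the evaluation metrics named above concrete, the following is a minimal sketch of how Top-1 accuracy and per-class precision/recall could be computed from predicted labels. It is not the authors' evaluation code: the function names `top1_accuracy` and `per_class_precision_recall`, the three class labels, and the toy predictions are all illustrative assumptions, and undefined ratios are set to 0.0 as a convention of this sketch.

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_precision_recall(y_true, y_pred, classes):
    """Return {class: (precision, recall)} using one-vs-rest counts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted but is wrong
            fn[t] += 1      # class t was missed
    scores = {}
    for c in classes:
        predicted_c = tp[c] + fp[c]   # samples predicted as c
        actual_c = tp[c] + fn[c]      # samples whose true class is c
        # Assumption of this sketch: an undefined ratio is reported as 0.0.
        precision = tp[c] / predicted_c if predicted_c else 0.0
        recall = tp[c] / actual_c if actual_c else 0.0
        scores[c] = (precision, recall)
    return scores


# Toy example with hypothetical labels and predictions (not results from the paper).
classes = ["Forest", "River", "Highway"]
y_true = ["Forest", "Forest", "River", "River", "Highway", "Highway"]
y_pred = ["Forest", "River", "River", "River", "Highway", "Forest"]

accuracy = top1_accuracy(y_true, y_pred)
scores = per_class_precision_recall(y_true, y_pred, classes)
```

Because these are single-label predictions, Top-1 accuracy is the share of exact matches, while precision and recall are reported per class, which shows which land-cover categories are confused with one another even when overall accuracy is high.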
Abstract
Land Cover (LC) image classification has become increasingly important for understanding environmental changes, urban planning, and disaster management. However, traditional LC methods are often labor-intensive and prone to human error. This paper explores state-of-the-art deep learning models for enhanced accuracy and efficiency in LC analysis. We compare convolutional neural networks (CNNs) against transformer-based methods, showcasing their applications and advantages in LC studies. We used EuroSAT, a patch-based LC classification dataset built from Sentinel-2 satellite images, and achieved state-of-the-art results using current transformer models.
