Federated Learning Across Decentralized and Unshared Archives for Remote Sensing Image Classification
Barış Büyüktaş, Gencer Sumbul, Begüm Demir
TL;DR
This paper addresses federated learning (FL) for remote sensing (RS) image classification across decentralized and unshared archives by conducting the first RS-focused comparative study of state-of-the-art FL algorithms. It combines a systematic survey of FL algorithms from the computer vision and machine learning literature with a theoretical comparison in terms of local training complexity, aggregation complexity, learning efficiency, communication cost, and scalability, followed by extensive RS experiments on BigEarthNet-S2 under three decentralization scenarios. Key findings show that local-training-focused methods (e.g., MOON, FedDC) offer robustness to non-IID data, while aggregation-focused approaches (e.g., FedNova, FedBN, pFedLA) deliver competitive performance with differing cost trade-offs. MOON provides strong early performance but at higher computational cost, whereas FedBN and FedDC balance robustness and cost. The work delivers practical guidelines for RS FL deployment, highlights the potential for privacy-preserving and multi-modal RS data integration, and provides public code to enable broader RS FL research and applications.
Abstract
Federated learning (FL) enables multiple deep learning models to collaboratively learn from decentralized data archives (i.e., clients) without accessing the data held on those clients. Although FL offers ample opportunities for knowledge discovery from distributed image archives, it is seldom considered in remote sensing (RS). In this paper, for the first time in RS, we present a comparative study of state-of-the-art FL algorithms for RS image classification problems. To this end, we first provide a systematic review of the FL algorithms presented in the computer vision and machine learning communities. Then, we select several state-of-the-art FL algorithms based on their effectiveness with respect to training data heterogeneity across clients (known as non-IID data). After presenting an extensive overview of the selected algorithms, we compare them theoretically in terms of their: 1) local training complexity; 2) aggregation complexity; 3) learning efficiency; 4) communication cost; and 5) scalability in terms of the number of clients. Following the theoretical comparison, we present experimental analyses that compare the algorithms under different decentralization scenarios. For the experimental analyses, we focus on multi-label image classification problems in RS. Based on our comprehensive analyses, we finally derive a guideline for selecting suitable FL algorithms in RS. The code of this work is publicly available at https://git.tu-berlin.de/rsim/FL-RS.
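To make the client/server split described above concrete, the following is a minimal sketch of one common FL pattern: each client trains locally on its own data, and a server combines the resulting parameters with a sample-size-weighted average (FedAvg-style aggregation). This is an illustrative assumption, not the method of any algorithm compared in the paper. The one-parameter linear model, the toy data, and all function names (`local_train`, `aggregate`, `federated_round`, `run`) are hypothetical and chosen only to keep the example self-contained.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # hypothetical toy (x, y) pair held by one client


def local_train(w: float, data: List[Sample], lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient steps of least squares (y ~ w * x) on one client's private data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def aggregate(client_weights: List[float], client_sizes: List[int]) -> float:
    """Sample-size-weighted average of the client parameters (FedAvg-style)."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


def federated_round(w_global: float, clients: List[List[Sample]]) -> float:
    """One communication round: only parameters leave the clients, never raw data."""
    updates = [local_train(w_global, data) for data in clients]
    sizes = [len(data) for data in clients]
    return aggregate(updates, sizes)


def run(clients: List[List[Sample]], rounds: int = 20) -> float:
    """Repeat communication rounds starting from a zero-initialized global parameter."""
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, clients)
    return w


# Three hypothetical clients whose data all follow y = 3x but differ in size and range.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (1.0, 3.0), (2.0, 6.0)],
]
w_final = run(clients)
```

In this toy setting the global parameter approaches 3 because all clients share the same underlying relation. Under non-IID data the local optima differ across clients, which is the situation the algorithms compared in this paper are designed to handle.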
