Geometric Knowledge-Assisted Federated Dual Knowledge Distillation Approach Towards Remote Sensing Satellite Imagery

Luyao Zou, Fei Pan, Jueying Li, Yan Kyaw Tun, Apurba Adhikary, Zhu Han, Hayoung Oh

TL;DR

Evaluation on multiple datasets shows that the proposed GK-FedDKD approach outperforms the considered state-of-the-art baselines; for example, with the Swin-T backbone it surpasses previous SOTA approaches by an average of 68.89% on the EuroSAT dataset.

Abstract

Federated learning (FL) has recently become a promising solution for analyzing remote sensing satellite imagery (RSSI). However, the large scale and inherent data heterogeneity of images collected from multiple satellites, where the local data distribution of each satellite differs from the global one, present significant challenges to effective model training. To address this issue, we propose a Geometric Knowledge-Guided Federated Dual Knowledge Distillation (GK-FedDKD) framework for RSSI analysis. In our approach, each local client first distills a teacher encoder (TE) from multiple student encoders (SEs) trained with unlabeled augmented data. The TE is then connected with a shared classifier to form a teacher network (TN) that supervises the training of a new student network (SN). The intermediate representations of the TN are used to compute local covariance matrices, which are aggregated at the server to generate global geometric knowledge (GGK). This GGK is subsequently employed for local embedding augmentation to further guide SN training. We also design a novel loss function and a multi-prototype generation pipeline to stabilize the training process. Evaluation on multiple datasets shows that the proposed GK-FedDKD approach outperforms the considered state-of-the-art baselines; for example, with the Swin-T backbone it surpasses previous SOTA approaches by an average of 68.89% on the EuroSAT dataset.
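
The abstract describes a teacher network supervising a student network but does not give the loss. As a rough illustration only, the sketch below uses the standard temperature-softened KL-divergence distillation loss; this is an assumed stand-in, not the paper's novel loss function, and the names `softmax`, `kd_loss`, and `temperature` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature^2.

    Assumption: a generic distillation loss, not the loss defined in the paper.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [2.0, 1.0, 0.1]
student = [0.5, 1.5, 0.2]
print(kd_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge, which is the property a supervising teacher relies on.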

Paper Structure

This paper contains 22 sections, 3 theorems, 25 equations, 10 figures, 6 tables, 1 algorithm.

Key Result

Lemma 1

For an arbitrary client $n$, from the start of the communication round $t+1$ to the final local iteration, before the server conducts the multi-prototype aggregation process, the local loss function $\mathcal{L}_n^\textrm{loss}$ is bounded by:

Figures (10)

  • Figure 1: The discrepancy between local and global distributions for satellite imagery. Each satellite may have different categories and different amounts of data. Images are taken from the Satellite Image Classification (SIC) dataset.
  • Figure 2: Architecture of the proposed GK-FedDKD method, which is formed by multiple clients (i.e., satellites) and a server. This architecture shows the specific implementation of Fig. \ref{fig_solution}.
  • Figure 3: The overall framework of the proposed GK-FedDKD method, comprising client and server parts. The client part contains: (A) KD with unlabeled augmented data for TE generation, (B) KD with labeled data and local covariance matrix calculation, (C) GVs-guided local embedding augmentation (LEA) and local learning regulation, (D) a linear layer-based module, and (E) a multi-prototype (MP) generation strategy. The server part performs 1) global covariance matrix generation, 2) model aggregation, and 3) MP aggregation.
  • Figure 4: Per-class data sample distribution across ten clients.
  • Figure 5: Data augmentation over the SIC dataset with rotation, Gaussian noise, flipping, brightening, darkening, saturation adjustment, and salt-and-pepper noise, respectively.
  • ...and 5 more figures
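
The server-side step named in the Figure 3 caption, global covariance matrix generation from client statistics, can be illustrated with the standard pooled-covariance identity. This is a minimal sketch under stated assumptions: each client is assumed to upload a sample count, a mean embedding, and a biased covariance matrix, and the helper names `mean_and_cov` and `aggregate` are hypothetical. The paper's actual aggregation rule is not given in this summary and may differ.

```python
def mean_and_cov(samples):
    """Return (count, mean vector, biased covariance matrix) for a list of embeddings."""
    n, d = len(samples), len(samples[0])
    mu = [sum(x[j] for x in samples) / n for j in range(d)]
    cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in samples) / n
            for j in range(d)] for i in range(d)]
    return n, mu, cov


def aggregate(stats):
    """Pool per-client (count, mean, covariance) triples into a global mean and covariance.

    Assumption: count-weighted pooling; raw embeddings never leave the clients.
    """
    total = sum(n for n, _, _ in stats)
    d = len(stats[0][1])
    g_mu = [sum(n * mu[j] for n, mu, _ in stats) / total for j in range(d)]
    g_cov = [[0.0] * d for _ in range(d)]
    for n, mu, cov in stats:
        for i in range(d):
            for j in range(d):
                # Within-client spread plus the offset of the client mean
                # from the global mean, weighted by the client's sample count.
                between = (mu[i] - g_mu[i]) * (mu[j] - g_mu[j])
                g_cov[i][j] += n * (cov[i][j] + between) / total
    return g_mu, g_cov


# Hypothetical 2-D embeddings held by two clients for one class.
client_a = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
client_b = [[0.0, 1.0], [2.0, 2.0]]
global_mean, global_cov = aggregate([mean_and_cov(client_a), mean_and_cov(client_b)])
print(global_mean, global_cov)
```

With biased (divide-by-n) covariances, the pooled result equals the covariance computed over the union of all client samples, so the server recovers the global statistic from summaries alone.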

Theorems & Definitions (3)

  • Lemma 1
  • Lemma 2
  • Theorem 1