Towards Privacy-preserved Pre-training of Remote Sensing Foundation Models with Federated Mutual-guidance Learning

Jieyi Tan, Chengwei Zhang, Bo Dang, Yansheng Li

TL;DR

The paper tackles privacy-preserving pre-training of Remote Sensing Foundation Models by introducing FedSense, a federated self-supervised learning framework that couples Server-to-Clients Guidance (SCG) with Clients-to-Server Guidance (CSG) to break the cycle between data heterogeneity and communication overhead. SCG enforces global-flatness and aligns client representations with a universal encoder, while CSG uses low-bit quantization with error-feedback and server-side similarity distillation on public data to reduce communication without sacrificing performance. The approach supports both contrastive and masked-image-modeling SSL backbones and demonstrates superior performance and communication efficiency across four downstream RS tasks, including effective 1-bit quantization. These results indicate FedSense enables scalable, privacy-preserving collaboration among institutions for pre-training RSFMs with real-world data heterogeneity. The work lays a foundation for extending privacy-preserved, multi-modal RSFM pre-training in future work.

Abstract

Traditional Remote Sensing Foundation Models (RSFMs) are pre-trained with a data-centralized paradigm, through self-supervision on large-scale curated remote sensing data. For each institution, however, pre-training RSFMs with limited data in a standalone manner may lead to suboptimal performance, while aggregating remote sensing data from multiple institutions for centralized pre-training raises privacy concerns. Collaboration is a promising way to resolve this dilemma: multiple institutions can jointly train RSFMs without sharing private data. In this paper, we propose FedSense, a novel privacy-preserved pre-training framework that enables such collaboration. However, this is a non-trivial task hindered by a vicious cycle between model drift, caused by remote sensing data heterogeneity, and high communication overhead. To break this vicious cycle, we introduce Federated Mutual-guidance Learning. Specifically, we propose a Server-to-Clients Guidance (SCG) mechanism to guide clients' updates towards global-flatness optimal solutions. Additionally, we propose a Clients-to-Server Guidance (CSG) mechanism to inject local knowledge into the server through low-bit communication. Extensive experiments on four downstream tasks demonstrate the effectiveness of FedSense in both full-precision and communication-reduced scenarios, showing gains in both communication efficiency and performance.
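To make the idea of low-bit communication with error feedback more concrete, the following is a minimal sketch of a generic 1-bit (sign plus shared scale) compressor that carries its quantization residual into the next round. It is an illustration under stated assumptions, not the quantizer used in FedSense, whose details are not given in this summary. The names `one_bit_quantize`, `dequantize`, and `ErrorFeedbackCompressor` are hypothetical, and the mean-absolute-value scale is one common choice assumed here.

```python
from typing import List, Tuple


def one_bit_quantize(values: List[float]) -> Tuple[List[int], float]:
    """Compress a vector to signs (+1/-1) plus one shared scale.

    Assumption: the scale is the mean absolute value, a common choice
    for sign-based compression; the paper's scheme may differ.
    """
    scale = sum(abs(v) for v in values) / len(values)
    signs = [1 if v >= 0 else -1 for v in values]
    return signs, scale


def dequantize(signs: List[int], scale: float) -> List[float]:
    """Reconstruct the approximate vector from signs and the shared scale."""
    return [s * scale for s in signs]


class ErrorFeedbackCompressor:
    """Hypothetical client-side compressor with error feedback.

    The part of each update that is lost to quantization is stored
    and added back to the next update before compressing again.
    """

    def __init__(self, dim: int) -> None:
        self.residual = [0.0] * dim

    def compress(self, update: List[float]) -> Tuple[List[int], float]:
        # Add the residual left over from the previous round.
        corrected = [u + r for u, r in zip(update, self.residual)]
        signs, scale = one_bit_quantize(corrected)
        decoded = dequantize(signs, scale)
        # Keep whatever the 1-bit message could not represent.
        self.residual = [c - d for c, d in zip(corrected, decoded)]
        return signs, scale


# Usage: a client compresses the same toy update over several rounds.
compressor = ErrorFeedbackCompressor(dim=4)
toy_update = [0.9, -0.1, 0.3, -0.5]
for _ in range(3):
    signs, scale = compressor.compress(toy_update)
    received = dequantize(signs, scale)  # what the server would decode
```

With this bookkeeping, the sum of everything the server decodes plus the client's current residual equals the sum of the true updates, so quantization error is deferred rather than discarded. Each round sends one bit per coordinate plus a single scale value.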

Paper Structure

This paper contains 20 sections, 3 theorems, 29 equations, 5 figures, 7 tables, 1 algorithm.

Key Result

Lemma 1

Under Assumption assump:grad, the optimal perturbation $\widetilde{\epsilon}$ in SCG satisfies:

Figures (5)

  • Figure 1: Illustration of privacy-preserved pre-training of RSFMs with FL to bridge data islands. The vicious cycle between data heterogeneity-induced model drift and communication bottlenecks reveals a critical performance-efficiency trade-off.
  • Figure 2: Overview of FedSense. The framework includes two components: Server-to-Clients Guidance (SCG) and Clients-to-Server Guidance (CSG). SCG guides clients' updates towards global-flatness optimal solutions, while CSG injects local knowledge into the server by low-bit communication.
  • Figure 3: Details of federated pre-training datasets. The dataset consists of 10 clients with million-scale heterogeneous private remote sensing data and public datasets maintained by a server.
  • Figure 4: Pipeline of privacy-preserved pre-training of RSFMs.
  • Figure 5: Illustration of the downstream usage of collaboratively pre-trained RSFMs to accommodate various Earth Observation tasks.

Theorems & Definitions (6)

  • Lemma 1: Optimal Perturbation Bound
  • Proof 1
  • Lemma 2: Quantization Error Decay
  • Proof 2
  • Theorem 1: Convergence Guarantee
  • Proof 3: Proof Sketch