FedSC: Provable Federated Self-supervised Learning with Spectral Contrastive Objective over Non-i.i.d. Data

Shusen Jing; Anlan Yu; Shuai Zhang; Songyang Zhang

TL;DR

This paper addresses federated self-supervised learning when the global objective cannot be written as a simple sum of local objectives. It introduces FedSC, a provable FedSSL method built on the spectral contrastive objective and enabled by sharing correlation matrices among clients, complemented by differential privacy to bound additional data leakage. The authors provide convergence guarantees to a stationary point with a near $\mathcal{O}(1/\sqrt{T})$ rate and quantify the extra privacy leakage from sharing correlation matrices, showing it diminishes with larger local datasets. Empirically, FedSC achieves superior or competitive accuracy on SVHN, CIFAR-10, and CIFAR-100 under non-i.i.d. data distributions and remains robust under partial participation and DP protection. Overall, FedSC offers a theoretically grounded, communication-efficient, and privacy-aware approach to FedSSL with enhanced inter-client representation quality.

Abstract

Recent efforts have been made to integrate self-supervised learning (SSL) with the framework of federated learning (FL). One unique challenge of federated self-supervised learning (FedSSL) is that the global objective of FedSSL usually does not equal the weighted sum of local SSL objectives. Consequently, conventional approaches, such as federated averaging (FedAvg), fail to precisely minimize the FedSSL global objective, often resulting in suboptimal performance, especially when data is non-i.i.d. To fill this gap, we propose a provable FedSSL algorithm, named FedSC, based on the spectral contrastive objective. In FedSC, clients share correlation matrices of data representations in addition to model weights periodically, which enables inter-client contrast of data samples in addition to intra-client contrast and contraction, resulting in improved quality of data representations. Differential privacy (DP) protection is deployed to control the additional privacy leakage on local datasets when correlation matrices are shared. We also provide theoretical analysis on the convergence and extra privacy leakage. The experimental results validate the effectiveness of our proposed algorithm.
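The mismatch between the global objective and a weighted sum of local objectives can be illustrated on a toy example. The sketch below assumes the standard form of the spectral contrastive loss, $-2\,\mathbb{E}[f(x)^\top f(x^+)] + \mathbb{E}[(f(x)^\top f(x'))^2]$, and looks only at its second (pairwise) term. The representation values, the equal client weights, and all function names are hypothetical and chosen for illustration; they are not taken from the paper.

```python
# Toy 2-D representations on two clients (hypothetical values, not from the paper).
client_a = [(1.0, 0.0), (1.0, 0.0)]
client_b = [(0.0, 1.0), (0.0, 1.0)]


def correlation(reps):
    """Empirical correlation matrix R = mean over samples of f(x) f(x)^T."""
    n, d = len(reps), len(reps[0])
    return [[sum(r[i] * r[j] for r in reps) / n for j in range(d)]
            for i in range(d)]


def frobenius_sq(matrix):
    """Squared Frobenius norm of a matrix given as a list of rows."""
    return sum(v * v for row in matrix for v in row)


def pairwise_term(reps):
    """Mean of (f(x)^T f(x'))^2 over all ordered pairs (x, x') in reps."""
    n = len(reps)
    return sum(
        sum(a * b for a, b in zip(x, y)) ** 2 for x in reps for y in reps
    ) / (n * n)


pooled = client_a + client_b

# Global pairwise term: pairs are drawn across clients as well as within them.
global_term = pairwise_term(pooled)

# What averaging purely local objectives would use (equal client weights assumed).
local_average = 0.5 * pairwise_term(client_a) + 0.5 * pairwise_term(client_b)

# The pairwise term equals ||R||_F^2, so it can be computed from R alone.
from_matrix = frobenius_sq(correlation(pooled))

print(global_term, local_average, from_matrix)
```

In this toy case the two clients hold orthogonal representations, so the global term is 0.5 while the average of the local terms is 1.0. The global value is recovered from the pooled correlation matrix alone, which is consistent with the idea of sharing correlation matrices rather than raw samples.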

Paper Structure (29 sections, 10 theorems, 89 equations, 2 figures, 6 tables, 2 algorithms)

Introduction
Related Works
Preliminaries: Spectral Contrastive (SC) Self-supervised Learning
Problem Formulation
FedSC: A Provable FedSSL Method
Correlation Matrices Sharing
Local Training
Comparison with existing FedSSL frameworks
Theoretical Analysis
Additional Privacy Leakage
Convergence of FedSC
Superior performance of FedSC
Sketch of Proof
Experiments
Experimental Setup
...and 14 more sections

Key Result

Lemma 6.3

Let $f:\mathcal{X} \rightarrow \mathbb{R}^n$ be a function with $\ell_2$-sensitivity $W$. Then the Gaussian mechanism $G_f(\cdot) = f(\cdot) + \mathcal{N}(0, \sigma^2\mathbf{I}_n)$ is $(\alpha, \frac{\alpha W^2}{2\sigma^2})$-RDP.
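As a rough numerical companion to Lemma 6.3, the sketch below evaluates the RDP bound for given $\alpha$, $W$, and $\sigma$. It also includes two standard facts from the RDP literature that are assumptions here rather than statements confirmed by this page: RDP costs at a fixed order $\alpha$ add under composition, and an $(\alpha, \epsilon)$-RDP guarantee implies $(\epsilon + \log(1/\delta)/(\alpha-1), \delta)$-DP. The function names are hypothetical.

```python
import math


def gaussian_rdp(alpha, sensitivity, sigma):
    """RDP epsilon of the Gaussian mechanism: alpha * W^2 / (2 * sigma^2)."""
    if alpha <= 1:
        raise ValueError("the Renyi order alpha must be greater than 1")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return alpha * sensitivity ** 2 / (2.0 * sigma ** 2)


def compose_rdp(rdp_epsilons):
    """Assumed composition rule: RDP epsilons at the same order alpha add up."""
    return sum(rdp_epsilons)


def rdp_to_dp(alpha, rdp_epsilon, delta):
    """Assumed conversion: (alpha, eps)-RDP gives (eps + log(1/delta)/(alpha-1), delta)-DP."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return rdp_epsilon + math.log(1.0 / delta) / (alpha - 1.0)


# Example with made-up parameters: 10 noisy releases at order alpha = 8.
alpha, W, sigma = 8.0, 1.0, 4.0
per_release = gaussian_rdp(alpha, W, sigma)          # 8 / 32 = 0.25
total_rdp = compose_rdp([per_release] * 10)          # 2.5
dp_epsilon = rdp_to_dp(alpha, total_rdp, delta=1e-5)
print(per_release, total_rdp, dp_epsilon)
```

Because the bound scales with $W^2/\sigma^2$, halving the sensitivity has the same effect on the RDP cost as doubling the noise standard deviation.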

Figures (2)

Figure 1: Diagram of the proposed FedSC. 1) The server synchronizes local models with the global model. 2) Clients compute the correlation matrices of their local datasets and send them to the server. 3) The server distributes the aggregated global correlation matrices back to the clients. 4) The clients update their local models according to the local objective specified in Eq. (\ref{eq:local}). 5) The server aggregates the local models and initiates the next iteration.
Figure 2: Convergence of FedSC and FedAvg+SC. 1) FedAvg+SC tends to hit a high error floor or to overfit. 2) FedSC consistently improves KNN accuracy. This observation validates the theoretical analysis in Sec. \ref{sec:the}.
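Steps 2 and 3 of Figure 1 (clients compute correlation matrices, the server aggregates and redistributes them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes one correlation matrix per client, Gaussian noise added entrywise on the client side, and aggregation weighted by local dataset size. The function names, the noise scale, and the toy data are all hypothetical.

```python
import random


def local_correlation(reps):
    """Step 2 (client): R_k = mean over local samples of f(x) f(x)^T."""
    n, d = len(reps), len(reps[0])
    return [[sum(r[i] * r[j] for r in reps) / n for j in range(d)]
            for i in range(d)]


def add_gaussian_noise(matrix, sigma, rng):
    """Assumed DP step: add independent N(0, sigma^2) noise to each entry."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in matrix]


def aggregate(matrices, sizes):
    """Step 3 (server): average the matrices, weighted by dataset size (assumed)."""
    total = sum(sizes)
    d = len(matrices[0])
    return [[sum(n * m[i][j] for m, n in zip(matrices, sizes)) / total
             for j in range(d)] for i in range(d)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical 2-D representations held by two clients of different sizes.
clients = [
    [(1.0, 0.0), (1.0, 0.0)],
    [(0.0, 1.0)],
]
sizes = [len(reps) for reps in clients]

# Each client releases a noisy matrix; the server only sees these releases.
noisy = [add_gaussian_noise(local_correlation(reps), 0.05, rng) for reps in clients]
global_noisy = aggregate(noisy, sizes)

# Without noise, the weighted average equals the correlation of the pooled data.
global_exact = aggregate([local_correlation(reps) for reps in clients], sizes)
pooled_exact = local_correlation(clients[0] + clients[1])
print(global_exact, pooled_exact)
```

With size-proportional weights, the noiseless aggregate coincides with the correlation matrix of the pooled data, which is why no raw samples need to leave the clients in this sketch.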

Theorems & Definitions (18)

Definition 6.1: $(\epsilon,\delta)$-DP
Definition 6.2: $(\alpha,\epsilon)$-RDP [mironov2017renyi]
Lemma 6.3: Gaussian Mechanism of RDP [mironov2017renyi]
Lemma 6.4: Composition of RDP [mironov2017renyi]
Lemma 6.5: [mironov2017renyi]
Proposition 6.6: Additional Privacy Leakage of FedSC
proof
Theorem 6.10
Lemma 2.4
proof
...and 8 more
