Cooperative Pseudo Labeling for Unsupervised Federated Classification

Kuangpu Guo, Lijun Sheng, Yongcan Yu, Jian Liang, Zilei Wang, Ran He

TL;DR

This work addresses unsupervised federated classification by leveraging CLIP in a distributed, unlabeled-data setting. It introduces FedCoPL, a framework that combines cooperative pseudo labeling to mitigate CLIP bias and label skew with partial prompt aggregation to balance global collaboration and local personalization. Empirical results across diverse datasets and skew types show that FedCoPL consistently outperforms baselines, with ablations validating the contributions of both pseudo labeling and prompt aggregation. The approach enables effective zero-shot-like classification in federated environments while preserving client privacy and reducing communication overhead, marking a practical step toward robust unsupervised FL with vision-language models.

Abstract

Unsupervised Federated Learning (UFL) aims to collaboratively train a global model across distributed clients without sharing data or accessing label information. Previous UFL works have predominantly focused on representation learning and clustering tasks. Recently, vision-language models (e.g., CLIP) have gained significant attention for their powerful zero-shot prediction capabilities. Leveraging this advancement, classification problems that were previously infeasible under the UFL paradigm now present promising new opportunities, yet remain largely unexplored. In this paper, we extend UFL to the classification problem with CLIP for the first time and propose a novel method, Federated Cooperative Pseudo Labeling (FedCoPL). Specifically, clients estimate and upload their pseudo-label distributions, and the server adjusts and redistributes them to avoid global imbalance among classes. Moreover, we introduce a partial prompt aggregation protocol for effective collaboration and personalization. In particular, visual prompts containing general image features are aggregated at the server, while text prompts encoding personalized knowledge are retained locally. Extensive experiments demonstrate the superior performance of our FedCoPL compared to baseline methods. Our code is available at https://github.com/krumpguo/FedCoPL.
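
The partial prompt aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: prompts are represented as plain lists of floats, the function names (`aggregate_visual_prompts`, `partial_aggregation_round`) are hypothetical, and weighting clients by their sample count `n` is a FedAvg-style assumption that the abstract does not specify.

```python
from typing import Dict, List

Vector = List[float]


def aggregate_visual_prompts(prompts: Dict[str, Vector],
                             weights: Dict[str, float]) -> Vector:
    """Weighted average of the clients' visual prompts.

    Assumption: weights are per-client sample counts (FedAvg-style).
    """
    total = sum(weights[cid] for cid in prompts)
    dim = len(next(iter(prompts.values())))
    aggregated = [0.0] * dim
    for cid, prompt in prompts.items():
        w = weights[cid] / total
        for i, value in enumerate(prompt):
            aggregated[i] += w * value
    return aggregated


def partial_aggregation_round(clients: Dict[str, dict]) -> Dict[str, dict]:
    """One communication round: share visual prompts, keep text prompts local.

    `clients` maps a client id to {"visual": Vector, "text": Vector, "n": int}.
    """
    visual = {cid: state["visual"] for cid, state in clients.items()}
    weights = {cid: float(state["n"]) for cid, state in clients.items()}
    global_visual = aggregate_visual_prompts(visual, weights)
    # Only the visual prompt is overwritten with the aggregate;
    # each client's text prompt never leaves the client.
    return {
        cid: {"visual": list(global_visual), "text": state["text"], "n": state["n"]}
        for cid, state in clients.items()
    }


clients = {
    "a": {"visual": [1.0, 3.0], "text": [0.5], "n": 1},
    "b": {"visual": [3.0, 5.0], "text": [-0.5], "n": 3},
}
updated = partial_aggregation_round(clients)
```

After the round, both clients hold the same visual prompt, while their text prompts are unchanged, which mirrors the split between shared general image features and locally retained personalized knowledge.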

Paper Structure

This paper contains 19 sections, 6 equations, 7 figures, 11 tables, 1 algorithm.

Figures (7)

  • Figure 1: The overview of our FedCoPL. For pseudo labeling, FedCoPL begins by filtering the initial unlabeled samples to estimate the local distributions, which are uploaded to the server. Then, the server globally selects $M$ pseudo labels for each category and allocates them to clients based on local estimated distributions. To handle label skews during training, we aggregate only visual prompts on the server to enhance global performance because the differences in textual prompts are significantly greater than those found in visual prompts.
  • Figure 2: Drift diversity and cosine distance of prompts among clients during training on the CIFAR10 dataset [krizhevsky2009learning]. The differences observed in textual prompts are significantly greater than those observed in visual prompts.
  • Figure 3: Results of experiments with various client numbers and different client joining rates under Dirichlet-based label skews ($\beta=0.1$). FPL [menghini2023enhancing] is adopted as the baseline pseudo labeling method.
  • Figure 4: Drift diversity and cosine distance of prompts among clients during training on the DTD dataset [cimpoi2014describing]. The differences observed in textual prompts are significantly greater than those found in visual prompts.
  • Figure 5: Drift diversity and cosine distance of prompts among clients during training on the RESISC45 dataset [cheng2017remote].
  • ...and 2 more figures
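
The server-side allocation step summarized in the Figure 1 caption (select $M$ pseudo labels per category, then allocate them to clients based on the local estimated distributions) could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `allocate_pseudo_labels` is hypothetical, the uploaded local estimates are taken to be per-class counts, and the proportional split with largest-remainder rounding is one plausible allocation rule, not necessarily the one used in FedCoPL.

```python
from typing import Dict, List


def allocate_pseudo_labels(local_counts: Dict[str, List[float]],
                           M: int) -> Dict[str, List[int]]:
    """Split a per-class budget of M pseudo labels across clients.

    local_counts[cid][c] is client cid's estimated number of samples of
    class c (assumed to be what clients upload). Each class budget is
    divided in proportion to these estimates, using largest-remainder
    rounding so that the integer quotas for a class sum to M.
    """
    clients = sorted(local_counts)
    num_classes = len(local_counts[clients[0]])
    quotas = {cid: [0] * num_classes for cid in clients}
    for c in range(num_classes):
        total = sum(local_counts[cid][c] for cid in clients)
        if total <= 0:
            continue  # no client reports this class; nothing to allocate
        shares = [M * local_counts[cid][c] / total for cid in clients]
        floors = [int(s) for s in shares]
        leftover = M - sum(floors)
        # Hand the remaining labels to the largest fractional parts.
        order = sorted(range(len(clients)),
                       key=lambda i: shares[i] - floors[i],
                       reverse=True)
        for i in order[:leftover]:
            floors[i] += 1
        for i, cid in enumerate(clients):
            quotas[cid][c] = floors[i]
    return quotas


# Two clients with skewed label estimates and a budget of 8 labels per class.
quotas = allocate_pseudo_labels({"a": [30, 0], "b": [10, 20]}, M=8)
uneven = allocate_pseudo_labels({"a": [1], "b": [1], "c": [1]}, M=10)
```

Because every class receives the same global budget $M$, no class dominates the pseudo-labeled pool, while each client's quota still follows its own estimated label distribution.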