One-Shot Federated Unsupervised Domain Adaptation with Scaled Entropy Attention and Multi-Source Smoothed Pseudo Labeling

Ali Abedi, Q. M. Jonathan Wu, Ning Zhang, Farhad Pourpanah

TL;DR

This work proposes a one-shot Federated Unsupervised Domain Adaptation (FUDA) method that outperforms state-of-the-art methods across four standard benchmarks while reducing communication and computation costs, making it highly suitable for real-world applications.

Abstract

Federated Learning (FL) is a promising approach for privacy-preserving collaborative learning. However, it faces significant challenges when dealing with domain shifts, especially when each client has access only to its own source data and cannot share it during target domain adaptation. Moreover, FL methods often incur high communication overhead because they exchange model updates between clients and the server over multiple rounds. We propose a one-shot Federated Unsupervised Domain Adaptation (FUDA) method to address these limitations. Specifically, we introduce Scaled Entropy Attention (SEA) for model aggregation and Multi-Source Pseudo Labeling (MSPL) for target domain adaptation. SEA uses scaled prediction entropy on the target domain to assign higher attention to more reliable source models. This improves the quality of the global model and ensures balanced weighting of contributions. MSPL distills knowledge from multiple source models to generate pseudo labels and handles label noise using smoothed soft-label cross-entropy (SSCE). Our approach outperforms state-of-the-art methods across four standard benchmarks while reducing communication and computation costs, making it highly suitable for real-world applications. The implementation code will be made publicly available upon publication.
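
To make the aggregation idea concrete, the following is a minimal sketch of entropy-based attention, not the authors' implementation. It assumes the attention weights are a softmax over negative, temperature-scaled mean prediction entropies on target data; the exact scaling used by SEA is not stated in this abstract. The function names (`mean_entropy`, `entropy_attention`, `aggregate`), the `temperature` parameter, and the toy probabilities and parameters are all hypothetical.

```python
import math

def mean_entropy(prob_rows):
    """Average Shannon entropy (in nats) over a batch of probability vectors."""
    total = 0.0
    for probs in prob_rows:
        total += -sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(prob_rows)

def entropy_attention(entropies, temperature=1.0):
    """Softmax over negative scaled entropies: lower entropy gets a larger weight.

    ASSUMPTION: this softmax form and the temperature are illustrative only.
    """
    scores = [-h / temperature for h in entropies]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]

def aggregate(param_sets, weights):
    """Weighted average of flat parameter lists, one list per source model."""
    size = len(param_sets[0])
    return [sum(w * p[i] for w, p in zip(weights, param_sets)) for i in range(size)]

# Toy target-domain predictions from two hypothetical source models.
confident_model = [[0.9, 0.1], [0.8, 0.2]]
uncertain_model = [[0.5, 0.5], [0.6, 0.4]]

entropies = [mean_entropy(confident_model), mean_entropy(uncertain_model)]
weights = entropy_attention(entropies, temperature=0.5)

# Toy flat parameters (e.g. bottleneck and head) for the two source models.
global_params = aggregate([[1.0, 2.0], [3.0, 4.0]], weights)
```

In this sketch the model with lower average entropy on the target samples receives the larger weight, so the aggregated parameters lie closer to that model's parameters. A smaller `temperature` sharpens the weighting, and a larger one moves it toward a plain average.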

Paper Structure

This paper contains 24 sections, 5 theorems, 42 equations, 6 figures, 8 tables, 1 algorithm.

Key Result

Lemma 1

Standard cross-entropy encourages the model to place nearly all probability mass on a single class. This leads to overconfidence and reduced generalization. The proof is provided in Supplementary Materials (Section proof:lemma_ce).
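
The behavior described in Lemma 1 can be illustrated numerically. The sketch below is an assumption-laden illustration, not the paper's SSCE definition: it uses standard uniform label smoothing, $q = (1 - \epsilon)\,p + \epsilon / K$, applied to a target $p$ over $K$ classes, and compares the resulting loss with plain cross-entropy as the logit margin of the labeled class grows. The helper names and the value $\epsilon = 0.1$ are hypothetical.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]

def smooth(target, eps):
    """Uniform label smoothing: mix the target with the uniform distribution."""
    k = len(target)
    return [(1.0 - eps) * t + eps / k for t in target]

def cross_entropy(target, logits):
    """Cross-entropy between a (possibly soft) target and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)

one_hot = [1.0, 0.0, 0.0]
margins = [1.0, 5.0, 10.0, 20.0]  # logit gap between class 0 and the rest

# Plain cross-entropy: the loss keeps shrinking as the margin grows,
# so training is pushed toward ever more confident predictions.
hard_losses = [cross_entropy(one_hot, [m, 0.0, 0.0]) for m in margins]

# Smoothed target: very large margins are penalized, so the minimum
# sits at a finite margin instead of at infinity.
soft_losses = [cross_entropy(smooth(one_hot, 0.1), [m, 0.0, 0.0]) for m in margins]
```

With the one-hot target the loss decreases monotonically over the listed margins, whereas with the smoothed target it is smallest at an intermediate margin and rises again for larger ones. This is the sense in which smoothing counteracts the overconfidence that the lemma attributes to standard cross-entropy.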

Figures (6)

  • Figure 1: Overview of our proposed FUDA framework. (a) Each client trains a local source model and sends only the bottleneck and head parameters to the server, which then aggregates these models with SEA and refines the global model for the target domain using MSPL. (b) SEA aggregation process. It calculates entropy attention weights to generate the global model. (c) MSPL module. It generates pseudo labels for the target domain using source models and trains the global model with SSCE loss.
  • Figure 2: Samples from three different datasets, showcasing various domains and classes from OfficeHome, Office-31, and DomainNet.
  • Figure 3: Sensitivity analysis of the proposed model. Impact of $\epsilon$ on SSCE loss.
  • Figure 4: Accuracy of source models for each OfficeHome target domain.
  • Figure 5: t-SNE visualizations of target domain feature distributions for both SEA and SEA + MSPL.
  • ...and 1 more figure

Theorems & Definitions (8)

  • Lemma 1
  • Claim 1
  • Theorem 1
  • Lemma 2
  • Claim 2
  • Theorem 2: Bound on Aggregated Target Risk
  • Lemma 3
  • Claim 3