Partial Federated Learning

Tiantian Feng, Anil Ramakrishna, Jimit Majmudar, Charith Peris, Jixuan Wang, Clement Chung, Richard Zemel, Morteza Ziyadi, Rahul Gupta

TL;DR

The paper addresses federated learning over multi-modal data in which modalities carry different privacy constraints. It introduces PartialFL, a framework that lets a subset of modalities (e.g., text) be shared with the server while others (e.g., audio) remain on-device. PartialFL combines a server-side encoder trained on the shareable modality, a global FL model trained on non-shareable data, and local edge models, and adds cross-modal and embedding alignment losses to transfer knowledge across modalities and devices. Training uses asynchronous alternating minimization with contrastive objectives, so labels are never shared with the server. In experiments on speech emotion recognition (SER) datasets and on Food-101, PartialFL outperforms the standard FL and SL baselines and approaches centralized performance, which indicates robustness to data heterogeneity. The paper also discusses privacy risks and considerations for future deployment.

Abstract

Federated Learning (FL) is a popular algorithm for training machine learning models on user data constrained to edge devices (for example, mobile phones) due to privacy concerns. Typically, FL is trained with the assumption that no part of the user data can be egressed from the edge. However, in many production settings, only specific data modalities or metadata are restricted to the device while others are not. For example, in commercial SLU systems, it is typically desired to prevent transmission of biometric signals (such as audio recordings of the input prompt) to the cloud, but egress of locally (i.e., on the edge device) transcribed text to the cloud may be possible. In this work, we propose a new algorithm called Partial Federated Learning (PartialFL), in which a machine learning model is trained on data where a subset of data modalities or their intermediate representations can be made available to the server. We further restrict our model training by preventing the egress of data labels to the cloud for better privacy, and instead use a contrastive-learning-based objective. We evaluate our approach on two different multi-modal datasets and show promising results.
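
To make the label-free contrastive objective more concrete, the following is a minimal sketch of a symmetric InfoNCE-style loss that pulls together embeddings of the same sample from a shareable modality (e.g., text) and a non-shareable one (e.g., audio). It is an illustration under stated assumptions, not the loss defined in the paper: the function names, the symmetric form, and the default temperature of 0.1 are all hypothetical choices made here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def log_softmax_at(logits, index):
    """Log-probability of `index` under a softmax over `logits` (numerically stable)."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return logits[index] - log_sum


def contrastive_alignment_loss(shared_emb, private_emb, temperature=0.1):
    """Symmetric InfoNCE-style loss over paired embeddings (hypothetical helper).

    Row i of `shared_emb` and row i of `private_emb` come from the same sample
    and form the positive pair; every other row in the batch is a negative.
    No class labels are used.
    """
    if not shared_emb or len(shared_emb) != len(private_emb):
        raise ValueError("inputs must be non-empty and have the same batch size")
    s = [l2_normalize(v) for v in shared_emb]
    p = [l2_normalize(v) for v in private_emb]
    n = len(s)
    # Cosine similarities scaled by the temperature.
    sim = [[sum(a * b for a, b in zip(s[i], p[j])) / temperature
            for j in range(n)] for i in range(n)]
    # Shared -> private direction: softmax over each row.
    loss_s2p = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # Private -> shared direction: softmax over each column.
    loss_p2s = -sum(log_softmax_at([sim[i][j] for i in range(n)], j)
                    for j in range(n)) / n
    return 0.5 * (loss_s2p + loss_p2s)


if __name__ == "__main__":
    text = [[1.0, 0.0], [0.0, 1.0]]
    audio_aligned = [[1.0, 0.0], [0.0, 1.0]]
    audio_swapped = [[0.0, 1.0], [1.0, 0.0]]
    print(contrastive_alignment_loss(text, audio_aligned))
    print(contrastive_alignment_loss(text, audio_swapped))
```

In this sketch, correctly paired embeddings yield a loss near zero, while swapped pairs are penalized heavily. A lower temperature sharpens the softmax and increases the penalty for mismatched pairs.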

Paper Structure (39 sections, 4 equations, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: Different models in the PartialFL learning architecture.
  • Figure 2: Model performance on the IEMOCAP and MSP-Improv datasets. PartialFL considerably outperforms FL and SL and performs close to the Centralized upper bound.
  • Figure 3: Training steps of the PartialFL framework. Training steps 1, 2, 3, and 4 are repeated in each global training round.
  • Figure 4: Top-5 accuracies on the test set for PartialFL compared with other baselines on the UPMC Food-101 dataset. In all cases, we repeat the experiment 5 times with different random seeds and report the average performance. For PartialFL, we report the best performance across temperature values. We use $\alpha = 1.0$ in all federated experiments (FL, SL, and PartialFL).