Quantum Federated Learning for Multimodal Data: A Modality-Agnostic Approach

Atit Pokharel, Ratun Rahman, Thomas Morris, Dinh C. Nguyen

TL;DR

This paper addresses the gap in quantum federated learning (QFL) for multimodal data by proposing a modality-agnostic mmQFL framework that uses an entanglement-based fusion layer to capture cross-modal correlations. It introduces a Missing Modality Agnostic (MMA) mechanism that isolates absent modalities with no-op gates and context vectors, ensuring stable training even with incomplete data. Through simulations on CMU-MOSEI, mmQFL demonstrates improved accuracy over state-of-the-art methods under both IID and non-IID distributions, and across missing-modality scenarios. The work advances privacy-preserving, scalable quantum learning for complex real-world tasks and provides a robust blueprint for multimodal quantum FL systems.

Abstract

Quantum federated learning (QFL) was recently introduced to enable distributed, privacy-preserving training of quantum machine learning (QML) models across quantum processors (clients). Despite recent research efforts, existing QFL frameworks predominantly focus on unimodal systems, limiting their applicability to real-world tasks that naturally involve multiple modalities. To fill this gap, we present, for the first time, a multimodal approach tailored to the QFL setting that performs intermediate fusion using quantum entanglement. Furthermore, to address a major bottleneck in multimodal QFL, where the absence of certain modalities during training can degrade model performance, we introduce a Missing Modality Agnostic (MMA) mechanism that isolates untrained quantum circuits, ensuring stable training without corrupted states. Simulation results demonstrate that the proposed multimodal QFL method with MMA improves accuracy by 6.84% under independent and identically distributed (IID) data and by 7.25% under non-IID data compared with state-of-the-art methods.
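As a rough illustration of the two ideas named above, entanglement-based fusion and isolation of a missing modality, the following toy statevector simulation may help. It is only a sketch under strong assumptions and is not the paper's circuit: each modality is reduced to a single qubit with one RY encoding angle, the fusion layer is a single CNOT, and an absent modality is handled by replacing both its encoder and the entangling gate that touches it with identity (no-op) gates. The context vectors mentioned in the TL;DR are not modeled. All function names (`ry`, `fused_state`, `entanglement_entropy`) are hypothetical.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

I2 = np.eye(2)
I4 = np.eye(4)
# CNOT with qubit 0 (modality A) as control and qubit 1 (modality B) as target,
# in the basis order |q0 q1> = |00>, |01>, |10>, |11>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def fused_state(theta_a, theta_b, b_present=True):
    """Encode two modalities on two qubits, then fuse them.

    Assumption for this sketch: when modality B is absent, its encoder and
    the entangling gate are both replaced by identity (no-op) gates, so the
    B qubit stays in |0> and does not influence the A qubit.
    """
    enc_b = ry(theta_b) if b_present else I2
    fusion = CNOT if b_present else I4
    start = np.array([1.0, 0.0, 0.0, 0.0])  # |00>
    return fusion @ (np.kron(ry(theta_a), enc_b) @ start)

def entanglement_entropy(state):
    """Entropy (in bits) of the Schmidt spectrum of a two-qubit pure state."""
    s = np.linalg.svd(state.reshape(2, 2), compute_uv=False)
    p = s ** 2
    p = p[p > 1e-12]  # drop zero Schmidt coefficients before taking logs
    return float(-(p * np.log2(p)).sum())

both = fused_state(np.pi / 2, np.pi / 3, b_present=True)
only_a = fused_state(np.pi / 2, np.pi / 3, b_present=False)
print("entropy with both modalities:", entanglement_entropy(both))
print("entropy with B missing:      ", entanglement_entropy(only_a))
```

In this toy setting, the fused state carries nonzero entanglement when both modalities are present, while the no-op path leaves a normalized product state in which the missing modality's qubit remains in |0>.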

Paper Structure

This paper contains 22 sections, 8 equations, 3 figures, 7 tables, 1 algorithm.

Figures (3)

  • Figure 1: An overview of the proposed mmQFL framework, where each quantum client has $M$ modalities. Each client trains a separate QNN model for each modality and fuses them using a quantum fusion layer before sending the fused model to the quantum server for aggregation.
  • Figure 2: Performance comparison between the quantum and classical approaches. In (a), for image data classification, we use a CNN for the classical approach and the iQNN model for the quantum approach. Similarly, in (b) and (c), we use an LSTM for the classical approach and aQNN and tQNN for the quantum approach on audio and text data, respectively.
  • Figure 3: Comparison of results with and without MMA under a non-IID data distribution. In epochs 25-35, 50-60, and 80-90, 1% of audio, image, and text data is missing, respectively.