Federated Modality-specific Encoders and Partially Personalized Fusion Decoder for Multimodal Brain Tumor Segmentation

Hong Liu; Dong Wei; Qian Dai; Xian Wu; Yefeng Zheng; Liansheng Wang

TL;DR

FedMEPD is a new FL framework with federated modality-specific encoders and partially personalized multimodal fusion decoders. It addresses two concurrent issues, intermodal heterogeneity and the need for personalized models, and outperforms various up-to-date methods for multimodal and personalized FL.

Abstract

Most existing federated learning (FL) methods for medical image analysis only considered intramodal heterogeneity, limiting their applicability to multimodal imaging applications. In practice, some FL participants may possess only a subset of the complete imaging modalities, posing intermodal heterogeneity as a challenge to effectively training a global model on all participants' data. Meanwhile, each participant expects a personalized model tailored to its local data characteristics in FL. This work proposes a new FL framework with federated modality-specific encoders and partially personalized multimodal fusion decoders (FedMEPD) to address the two concurrent issues. Specifically, FedMEPD employs an exclusive encoder for each modality to account for the intermodal heterogeneity. While these encoders are fully federated, the decoders are partially personalized to meet individual needs -- using the discrepancy between global and local parameter updates to dynamically determine which decoder filters are personalized. Implementation-wise, a server with full-modal data employs a fusion decoder to fuse representations from all modality-specific encoders, thus bridging the modalities to optimize the encoders via backpropagation. Moreover, multiple anchors are extracted from the fused multimodal representations and distributed to the clients in addition to the model parameters. Conversely, the clients with incomplete modalities calibrate their missing-modal representations toward the global full-modal anchors via scaled dot-product cross-attention, making up for the information loss due to absent modalities. FedMEPD is validated on the BraTS 2018 and 2020 multimodal brain tumor segmentation benchmarks. Results show that it outperforms various up-to-date methods for multimodal and personalized FL, and its novel designs are effective.
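The calibration step named in the abstract, scaled dot-product cross-attention from a client's missing-modal features to the server's full-modal anchors, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax` and `calibrate`, the absence of learned query/key/value projections, and the residual addition at the end are all assumptions made to keep the example small and dependency-free.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def calibrate(local_feats, anchors):
    """Hypothetical calibration of local features toward global anchors.

    Queries are the client's feature vectors; keys and values are both the
    anchors (assumption: no learned projection matrices in this sketch).
    """
    dim = len(anchors[0])
    scale = math.sqrt(dim)
    calibrated = []
    for query in local_feats:
        # Scaled dot-product score between this query and every anchor.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in anchors]
        weights = softmax(scores)
        # Attention output: weighted mix of the anchors.
        attended = [
            sum(w * key[d] for w, key in zip(weights, anchors)) for d in range(dim)
        ]
        # Assumption: a residual connection adds the anchor mix to the local feature.
        calibrated.append([q + a for q, a in zip(query, attended)])
    return calibrated
```

In this sketch each local feature is pulled toward the anchors it is most similar to, which is one plausible reading of how anchor information could compensate for absent modalities; the paper's LACCA module may differ in its projections, normalization, and where the calibration is applied.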

Paper Structure (28 sections, 8 equations, 7 figures, 6 tables, 1 algorithm)

Introduction
Related Work
Brain Tumor Segmentation with Multimodal MRI
FL with Data Heterogeneity
Multimodal FL in Medical Image Analysis
Method
Problem Definition
Framework Overview
Federated Modality-specific Encoders
Partially Personalized Fusion Decoder
Multi-Anchor Multimodal Representation
Localized Adaptive Calibration via Cross-Attention
Experiments and Results
Datasets and Experimental Setting
Implementation
...and 13 more sections

Figures (7)

Figure 1: (a) Example images of the four modalities in BraTS 2020 [menze2014multimodal] demonstrating severe intermodal heterogeneities, and corresponding tumor regions: blue: edema; yellow: enhancing tumor; and green: necrotic and non-enhancing tumor core. (b) To address the heterogeneities caused by different modal combinations across various sites, this work proposes adopting different aggregation strategies for modality-specific encoders (fully federated) and multi-modal fusion decoders (partially federated, partially personalized), to facilitate both common knowledge sharing and effective personalization.
Figure 2: Overview of the proposed FedMEPD framework. The server has four modality-specific encoders (one for each modality) and a multimodal fusion decoder, whose fused features are clustered to produce multimodal anchors. Each client has a fully federated modality-specific encoder for each modality and a partially federated (partially personalized) fusion decoder. A localized adaptive calibration via cross-attention (LACCA) module calibrates the clients’ missing-modal representations toward the server’s multimodal anchors.
Figure 3: Illustration of the proposed localized adaptive calibration via cross-attention (LACCA) module.
Figure 4: Experimental results on the test data in mDSC (%). *: $p<0.05$ comparing against our method in each column.
Figure 5: Experimental results on the test data in HD95 (pixel). *: $p<0.05$ comparing against our method in each column.
...and 2 more figures
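
The captions of Figures 1 and 2 describe a fusion decoder that is partially federated and partially personalized, and the abstract states that the discrepancy between global and local parameter updates decides which decoder filters are personalized. The sketch below shows one way such a per-filter decision could look. The discrepancy measure (cosine similarity between flattened updates), the `threshold` parameter, and the names `cosine` and `merge_decoder` are assumptions for illustration; the source does not specify them here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two flat vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def merge_decoder(local_filters, global_filters, local_updates, global_updates,
                  threshold=0.0):
    """Hypothetical per-filter merge of a client's decoder with the global one.

    Each filter is a flat list of floats. A filter whose local update disagrees
    with the global update (similarity below `threshold`, an assumed criterion)
    stays personalized; every other filter takes the global weights.
    """
    merged, personalized = [], []
    for lf, gf, lu, gu in zip(local_filters, global_filters,
                              local_updates, global_updates):
        keep_local = cosine(lu, gu) < threshold
        personalized.append(keep_local)
        merged.append(list(lf) if keep_local else list(gf))
    return merged, personalized
```

Because the mask is recomputed from the current updates on every call, the set of personalized filters can change from round to round, matching the "dynamically determine" wording in the abstract.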
