FedSCA: Federated Tuning with Similarity-guided Collaborative Aggregation for Heterogeneous Medical Image Segmentation

Yumin Zhang, Yan Gao, Haoran Duan, Hanqing Guo, Tejal Shah, Rajiv Ranjan, Bo Wei

TL;DR

This work tackles privacy-constrained, non-IID data challenges in medical image segmentation by federating fine-tuning of transformer-based foundation models. It introduces FedSCA, a three-part framework consisting of Federated Parameter-Efficient Learning (FEL), Low-level Adapter Transmission (LAT), and Similarity-Guided Collaborative Aggregation (SGCA). Through adapter-based PEFT, sharing only low-level adapters, and a server-side similarity-guided aggregation, FedSCA achieves robust performance with substantially reduced communication, delivering state-of-the-art results on three standard FL benchmarks. The approach enables scalable, privacy-preserving collaboration among hospitals, offering practical benefits for deploying foundation models in medical imaging.

Abstract

Transformer-based foundation models (FMs) have recently demonstrated remarkable performance in medical image segmentation. However, scaling these models is challenging because medical image datasets within isolated hospitals are small and privacy concerns restrict data centralization. These constraints, combined with the data-intensive nature of FMs, hinder their broader application. Integrating federated learning (FL) with foundation-model fine-tuning (FLFM) offers a potential solution by enabling collaborative model training without data sharing, allowing FMs to take advantage of a diverse pool of sensitive medical image data across hospitals/clients. However, non-independent and identically distributed (non-IID) data among clients, paired with computational and communication constraints in federated environments, presents an additional challenge that limits further performance improvements and remains inadequately addressed in existing studies. In this work, we propose a novel FLFM fine-tuning framework, Federated tuning with Similarity-guided Collaborative Aggregation (FedSCA), encompassing all phases of the FL process. This includes (1) specially designed parameter-efficient fine-tuning (PEFT) for local client training to enhance computational efficiency; (2) partial low-level adapter transmission for communication efficiency; and (3) similarity-guided collaborative aggregation (SGCA) on the server side to address non-IID issues. Extensive experiments on three FL benchmarks for medical image segmentation demonstrate the effectiveness of our proposed FedSCA, establishing new SOTA performance.
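
The server-side part of such a round can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how similarity is measured or how it is turned into aggregation weights, so the sketch assumes cosine similarity between flattened adapter updates followed by a softmax per client. The key format "adapter.<layer index>", the function names, and the temperature parameter are all hypothetical.

```python
import math


def low_level_only(adapters, num_low_layers):
    """Keep only adapters whose layer index is below num_low_layers.

    Assumes hypothetical keys of the form "adapter.<layer index>".
    """
    return {k: v for k, v in adapters.items()
            if int(k.split(".")[1]) < num_low_layers}


def flatten(update):
    """Concatenate all adapter values of one client into a single vector."""
    return [x for name in sorted(update) for x in update[name]]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def collaboration_matrix(updates, temperature=1.0):
    """Row-stochastic matrix W; W[i][j] is how much client i takes from client j.

    Assumption: weights are a softmax over pairwise cosine similarities.
    """
    vecs = [flatten(u) for u in updates]
    matrix = []
    for vi in vecs:
        sims = [cosine(vi, vj) / temperature for vj in vecs]
        top = max(sims)
        exps = [math.exp(s - top) for s in sims]  # shifted for stability
        total = sum(exps)
        matrix.append([e / total for e in exps])
    return matrix


def aggregate(updates, matrix):
    """Return one weighted average of all uploaded adapters per client."""
    results = []
    for row in matrix:
        agg = {}
        for name in updates[0]:
            size = len(updates[0][name])
            agg[name] = [sum(w * u[name][k] for w, u in zip(row, updates))
                         for k in range(size)]
        results.append(agg)
    return results
```

In this sketch each client would first call low_level_only on its trained adapters and upload the result; the server then builds the matrix and sends each client its own row-weighted aggregate, so clients with similar updates contribute more to each other.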

Paper Structure

This paper contains 17 sections, 11 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: Illustration of the communication process during FL training. Compared to high-level client-specific semantic features, the low-level model updates transmitted in the proposed FedSCA contain more knowledge shared among clients.
  • Figure 2: Pipeline of FedSCA, which consists of three phases in one complete FL round: federated parameter-efficient learning (FEL), low-level adapter transmission (LAT), and similarity-guided collaborative aggregation (SGCA). FEL: In the local training phase, only the parameters of the inserted adapters are learnable. LAT: After local training, each client sends its low-level adapter parameters to the server to share universal representation knowledge among clients. SGCA: Based on the collaborative relationship matrix, the server weights its aggregation of the uploaded parameters so that similar clients learn more from each other.
  • Figure 3: During local training (epochs 1-10) on the Fed-Polyp dataset, the parameters of the lower layers shift less than those of the higher layers.
  • Figure 4: From top to bottom: visualized samples from Fed-Polyp [wang2022personalizing], Fed-Fundus [liu2021feddg], and Fed-Prostate [liu2021feddg]. Significant differences in image representation and sample size can be observed between clients.
  • Figure 5: The IoU and Dice change against different $L$ values on the Fed-Prostate dataset.
  • ...and 2 more figures
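
The IoU and Dice scores named in Figure 5 are the standard overlap metrics for segmentation masks. The following is a minimal sketch of how they are commonly computed for binary masks, not the paper's evaluation code: the function name is hypothetical, masks are assumed to be flat sequences of 0/1 labels, and returning 1.0 when both masks are empty is a convention chosen here.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    # Assumed convention: two empty masks count as a perfect match.
    iou = intersection / union if union else 1.0
    total = pred_area + target_area
    dice = 2 * intersection / total if total else 1.0
    return iou, dice
```

For a single pair of masks the two scores are related by Dice = 2 * IoU / (1 + IoU), so they always rank a pair of predictions on the same image in the same order.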