FedBrain-Distill: Communication-Efficient Federated Brain Tumor Classification Using Ensemble Knowledge Distillation on Non-IID Data

Rasoul Jafari Gohari; Laya Aliahmadipour; Ezat Valipour

TL;DR

Brain tumor MRI classification faces privacy and communication challenges in traditional deep learning pipelines. FedBrain-Distill proposes a communication-efficient federated knowledge-distillation framework in which an ensemble of VGGNet16 teachers, each trained on private data, generates soft labels on a public dataset, and a lightweight student learns from those labels via a distillation objective. The student optimizes a total loss $\mathcal{L}_{total}=\alpha\mathcal{L}_{student}+(1-\alpha)\mathcal{L}_{distill}$ with $P_{agg}=\frac{1}{T}\sum_t P_t$ and $P_t=\sigma_T(f_{\theta_t^*}(X_{public}))$, using different temperatures $T$ for the IID and non-IID settings. The symbol $T$ is overloaded here: in $\frac{1}{T}\sum_t$ it counts the teachers being averaged, while in $\sigma_T$ it is the softmax temperature. Experiments on the Figshare brain tumor dataset under IID and non-IID Dirichlet partitions show competitive accuracy at substantially reduced communication cost, demonstrating the viability of architecture-agnostic FL via ensemble KD for privacy-sensitive medical imaging.
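The objective above can be made concrete with a small sketch. The code below is illustrative only and is not taken from the authors' repository: it assumes that $\mathcal{L}_{student}$ is a cross-entropy against a hard label and that $\mathcal{L}_{distill}$ is a Kullback-Leibler divergence between the averaged teacher distribution and the student's softened output. All function names are hypothetical, and the logits are made-up numbers.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax (sigma_T) over one logit vector."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_soft_labels(teacher_logits, temperature):
    """P_agg: average of the teachers' softened distributions P_t."""
    probs = [softmax_with_temperature(l, temperature) for l in teacher_logits]
    n_teachers = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_teachers for c in range(n_classes)]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def total_loss(student_logits, label, teacher_logits, alpha, temperature):
    """alpha * L_student + (1 - alpha) * L_distill for one public sample.

    Assumption: L_student is cross-entropy on the hard label and
    L_distill is KL(P_agg || softened student output).
    """
    p_agg = aggregate_soft_labels(teacher_logits, temperature)
    student_hard = softmax_with_temperature(student_logits, 1.0)
    student_soft = softmax_with_temperature(student_logits, temperature)
    l_student = -math.log(student_hard[label])
    l_distill = kl_divergence(p_agg, student_soft)
    return alpha * l_student + (1 - alpha) * l_distill

# Two hypothetical teachers scoring one public image over three classes.
teachers = [[2.0, 1.0, 0.1], [1.5, 1.2, 0.3]]
loss = total_loss([1.0, 0.8, 0.2], label=0, teacher_logits=teachers,
                  alpha=0.5, temperature=2.0)
```

In this reading only the soft labels $P_t$ leave each client, never model weights, which is consistent with the communication savings and the architecture independence the summary claims.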

Abstract

The brain is one of the most complex organs in the human body. Due to this complexity, classifying brain tumors remains a significant challenge, which makes brain tumors a particularly serious medical issue. Techniques such as Machine Learning (ML) coupled with Magnetic Resonance Imaging (MRI) have paved the way for doctors and medical institutions to classify different types of tumors. However, these techniques have limitations that can violate patients' privacy. Federated Learning (FL) has recently been introduced to address this issue, but FL itself suffers from limitations such as communication costs and a dependency on model architecture that forces all clients to use identical architectures. In this paper, we propose FedBrain-Distill, an approach that leverages Knowledge Distillation (KD) in an FL setting to preserve users' privacy and keep FL clients independent in terms of model architecture. FedBrain-Distill uses an ensemble of teachers that distill their knowledge into a simple student model. The evaluation of FedBrain-Distill demonstrated high accuracy for both Independent and Identically Distributed (IID) and non-IID data, with substantially low communication costs, on the real-world Figshare brain tumor dataset. We used the Dirichlet distribution to partition the data into IID and non-IID splits. All implementation details are accessible through our GitHub repository.
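The Dirichlet partitioning mentioned in the abstract can be sketched as follows. This is a common label-skew recipe and not necessarily the authors' exact procedure: for each class, client shares are drawn from a symmetric Dirichlet distribution, so a large concentration value gives near-uniform (IID-like) splits and a small value gives skewed (non-IID) splits. The function names, the `concentration` parameter, and the toy labels are assumptions made for illustration.

```python
import random

def dirichlet_sample(concentration, k, rng):
    """Draw one sample from a symmetric Dirichlet via normalized gammas."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # guard against underflow at tiny concentrations
        return [1.0 / k] * k
    return [d / total for d in draws]

def dirichlet_partition(labels, num_clients, concentration, seed=0):
    """Split sample indices across clients, class by class.

    Small concentration -> skewed class mix per client (non-IID);
    large concentration -> near-uniform class mix (IID-like).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        indices = by_class[y]
        rng.shuffle(indices)
        shares = dirichlet_sample(concentration, num_clients, rng)
        start, cumulative = 0, 0.0
        for c in range(num_clients):
            cumulative += shares[c]
            # The last client takes the remainder so no index is dropped.
            end = len(indices) if c == num_clients - 1 \
                else round(cumulative * len(indices))
            end = max(end, start)
            clients[c].extend(indices[start:end])
            start = end
    return clients

# Toy example: 90 samples, three classes, five hypothetical teachers.
toy_labels = [i % 3 for i in range(90)]
non_iid_parts = dirichlet_partition(toy_labels, num_clients=5, concentration=0.5)
iid_like_parts = dirichlet_partition(toy_labels, num_clients=5, concentration=1000.0)
```

Every index lands in exactly one client partition; only the class balance within each partition changes with the concentration value.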

Paper Structure (11 sections, 9 equations, 7 figures, 3 tables)

Introduction
Related Work
Proposed Method
Preprocessing
Data Partitioning
Public and Private Dataset
Teacher Models
Student Model
Student Model Architecture
Experimental Results
Conclusion

Figures (7)

Figure 1: Federated learning overview, showing the aggregation of clients' weights with the global model in the aggregator/server.
Figure 2: FedBrain-Distill workflow: The student model resides in the aggregator and trains on the public dataset, while the teacher models train on their private datasets and share their knowledge with the student through knowledge distillation. The aggregated soft labels obtained from the teachers' softmax layers are distilled into the student model using Kullback-Leibler divergence. The divergence between the student and the teachers shrinks as the number of training rounds grows.
Figure 3: Comparison of normalized and enhanced tumor images before and after applying the CLAHE technique.
Figure 4: Student model accuracy after 10 communication rounds on non-IID data with 2 and 5 teachers.
Figure 5: Student model accuracy after 10 communication rounds on IID data with 2 and 5 teachers.
...and 2 more figures
