Federated Learning with Layer Skipping: Efficient Training of Large Language Models for Healthcare NLP
Lihong Zhang, Yue Li
TL;DR
This paper addresses privacy-preserving collaboration in healthcare NLP with Layer-Skipping Federated Learning, which freezes most layers of a pre-trained LLM and fine-tunes only a selected subset to reduce communication. Using LLaMA 3.2-1B, the method cuts communication costs by approximately 70% while staying within 2% of centralized training performance on i2b2 and MIMIC-III, and it remains robust to non-IID data. The approach integrates with differential privacy and secure aggregation to strengthen privacy guarantees, and ablation studies identify an 8-layer configuration as the best trade-off between accuracy and communication. The work has practical implications for deploying large language models across resource-constrained healthcare institutions, reporting better efficiency, privacy, and convergence speed than several baselines.
Abstract
Federated learning (FL) enables collaborative model training across organizations without sharing raw data, addressing crucial privacy concerns in healthcare natural language processing (NLP). However, training large language models (LLMs) in federated settings faces significant challenges, including communication overhead and data heterogeneity. We propose Layer-Skipping Federated Learning, where only selected layers of a pre-trained LLM are fine-tuned across clients while others remain frozen. Applied to LLaMA 3.2-1B, our approach reduces communication costs by approximately 70% while maintaining performance within 2% of centralized training. We evaluate our method on clinical NER and classification tasks using i2b2 and MIMIC-III datasets. Our experiments demonstrate that Layer-Skipping FL outperforms competitive baselines, handles non-IID clinical data distributions effectively, and shows robustness when combined with differential privacy. This approach represents a practical solution for privacy-preserving collaborative learning in healthcare NLP.
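The core idea, that clients upload only the fine-tuned layers while frozen layers never leave the device, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain sample-weighted averaging as the aggregation rule, represents weights as flat Python lists rather than LLaMA tensors, and all names (`trainable_subset`, `weighted_average`, `merge`, `communication_saving`, the `layer0`..`layer3` keys) are hypothetical.

```python
from typing import Dict, List, Set

# Toy stand-in for model weights: layer name -> flat list of floats.
Params = Dict[str, List[float]]


def trainable_subset(params: Params, trainable: Set[str]) -> Params:
    """Return copies of only the layers a client would upload."""
    return {name: list(w) for name, w in params.items() if name in trainable}


def weighted_average(updates: List[Params], sizes: List[int]) -> Params:
    """Average client updates, weighting each client by its sample count."""
    total = sum(sizes)
    return {
        name: [
            sum(u[name][i] * n for u, n in zip(updates, sizes)) / total
            for i in range(len(updates[0][name]))
        ]
        for name in updates[0]
    }


def merge(global_params: Params, aggregated: Params) -> Params:
    """Overwrite the trainable layers; frozen layers are kept as they were."""
    return {name: aggregated.get(name, w) for name, w in global_params.items()}


def communication_saving(params: Params, trainable: Set[str]) -> float:
    """Fraction of weights that never leave the client."""
    total = sum(len(w) for w in params.values())
    sent = sum(len(w) for name, w in params.items() if name in trainable)
    return 1.0 - sent / total


# Hypothetical 4-layer model in which only the last layer is fine-tuned.
global_params = {f"layer{i}": [0.0, 0.0] for i in range(4)}
trainable = {"layer3"}

# Two hypothetical clients after local training (values are made up).
local_a = {**global_params, "layer3": [1.0, 3.0]}  # client with 1 sample
local_b = {**global_params, "layer3": [3.0, 5.0]}  # client with 3 samples
uploads = [trainable_subset(p, trainable) for p in (local_a, local_b)]

aggregated = weighted_average(uploads, sizes=[1, 3])
new_global = merge(global_params, aggregated)
saving = communication_saving(global_params, trainable)
```

In this toy setup one of four equally sized layers is uploaded, so `saving` is 0.75. That figure is a property of the made-up layer sizes and is not a reproduction of the roughly 70% reduction reported for LLaMA 3.2-1B.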
