Simplified Swarm Learning Framework for Robust and Scalable Diagnostic Services in Cancer Histopathology
Yanjie Wu, Yuhao Ji, Saiho Lee, Juniad Akram, Ali Braytee, Ali Anaissi
TL;DR
The paper tackles privacy and interoperability challenges in healthcare AI by replacing blockchain-based Swarm Learning with a blockchain-free, dynamic peer-to-peer Swarm Learning (P2P-SL) framework. It combines decentralized training, secure TLS/gRPC communication, and adaptive, locally aggregated updates to enable robust learning on imbalanced histopathology datasets, using TorchXRayVision with a DenseNet decoder. Key contributions include dynamic network discovery, peer-based weight aggregation, and domain-specific model enhancements; extensive experiments show performance competitive with centralized baselines while preserving data privacy. The framework demonstrates improved generalization, resilience to data scarcity, and practical applicability to privacy-sensitive diagnostic tasks in resource-constrained environments, broadening the accessibility and scalability of advanced AI in healthcare.
Abstract
The complexities of healthcare data, including privacy concerns, imbalanced datasets, and interoperability issues, necessitate innovative machine learning solutions. Swarm Learning (SL), a decentralized alternative to Federated Learning, offers privacy-preserving distributed training, but its reliance on blockchain technology hinders accessibility and scalability. This paper introduces a Simplified Peer-to-Peer Swarm Learning (P2P-SL) Framework tailored for resource-constrained environments. By eliminating blockchain dependencies and adopting lightweight peer-to-peer communication, the proposed framework ensures robust model synchronization while maintaining data privacy. Applied to cancer histopathology, the framework integrates optimized pre-trained models, such as TorchXRayVision, enhanced with DenseNet decoders, to improve diagnostic accuracy. Extensive experiments demonstrate the framework's efficacy in handling imbalanced and biased datasets, achieving performance comparable to centralized models while preserving privacy. This study paves the way for democratizing advanced machine learning in healthcare, offering a scalable, accessible, and efficient solution for privacy-sensitive diagnostic applications.
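To make the idea of peer-based weight aggregation concrete, the following is a minimal sketch of how a node might merge its own model weights with those received from its peers. It is not the authors' implementation: the function name `aggregate_peer_weights`, the plain-dictionary weight format, and the choice of a sample-count-weighted average are all assumptions made for illustration, and the TLS/gRPC transport and peer discovery are omitted entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical weight format: parameter name -> flattened list of values.
Weights = Dict[str, List[float]]


def aggregate_peer_weights(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average weight dictionaries, weighting each by its sample count.

    `updates` holds (weights, num_local_samples) pairs: the node's own
    update plus whatever its peers sent. Weighting by sample count is an
    assumption here; the paper's adaptive rule may differ.
    """
    if not updates:
        raise ValueError("need at least one update to aggregate")
    total_samples = sum(n for _, n in updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    reference, _ = updates[0]
    merged: Weights = {}
    for name, values in reference.items():
        acc = [0.0] * len(values)
        for weights, n in updates:
            share = n / total_samples  # this node's contribution
            for i, v in enumerate(weights[name]):
                acc[i] += v * share
        merged[name] = acc
    return merged


# Usage: a node with 1 sample merges with a peer that has 3 samples.
node_a = ({"w": [1.0, 2.0]}, 1)
node_b = ({"w": [3.0, 6.0]}, 3)
merged = aggregate_peer_weights([node_a, node_b])
print(merged)  # {'w': [2.5, 5.0]}
```

Because every node runs the same merge locally on whatever updates it has received, no central server or ledger is needed to coordinate the round, which is the property the blockchain-free design relies on.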
