Parallel Split Learning with Global Sampling
Mohammad Kohankhaki, Ahmad Ayad, Mahdi Barhoush, Anke Schmeink
TL;DR
GPSL is a server-driven scheme that fixes the global batch size and computes per-client batch-size schedules from pool-level proportions. It obtains finite-population deviation guarantees via Serfling's inequality and, unlike local sampling schemes, incurs zero rounding bias.
Abstract
Distributed deep learning in resource-constrained environments faces scalability and generalization challenges due to large effective batch sizes and non-identically distributed client data. We introduce a server-driven sampling strategy that maintains a fixed global batch size by dynamically adjusting client-side batch sizes. This decouples the effective batch size from the number of participating devices and ensures that global batches better reflect the overall data distribution. Using standard concentration bounds, we establish tighter deviation guarantees than existing approaches. Empirical results on a benchmark dataset confirm that the proposed method improves model accuracy, training efficiency, and convergence stability, offering a scalable solution for learning at the network edge.
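To make the idea of a fixed global batch with varying client-side batch sizes concrete, the following is a minimal sketch and not the authors' implementation. It assumes the server knows only each client's dataset size, draws every global batch without replacement from the pooled data, and assigns each client a batch size equal to the number of its samples that fall into that draw. The function name `global_sampling_schedule` and its parameters are hypothetical.

```python
import random
from collections import Counter


def global_sampling_schedule(client_sizes, global_batch_size, seed=0):
    """Sketch of a server-side batch-size schedule (hypothetical helper).

    client_sizes: number of local samples held by each client.
    global_batch_size: fixed total batch size per training step.
    Returns one list per step; entry k is the batch size of client k.
    """
    rng = random.Random(seed)
    # One pool entry per sample, labelled with the id of the owning client.
    pool = [cid for cid, n in enumerate(client_sizes) for _ in range(n)]
    # Shuffling and slicing the pool is equivalent to drawing consecutive
    # global batches without replacement from the pooled data.
    rng.shuffle(pool)
    schedule = []
    for start in range(0, len(pool), global_batch_size):
        counts = Counter(pool[start:start + global_batch_size])
        schedule.append([counts.get(cid, 0) for cid in range(len(client_sizes))])
    return schedule


if __name__ == "__main__":
    # Three clients with unequal data; every full step sums to 16 samples.
    for step, sizes in enumerate(global_sampling_schedule([50, 30, 20], 16)):
        print(step, sizes, sum(sizes))
```

In this sketch the per-step total never depends on the number of clients, and each client's batch sizes add up to exactly its dataset size over one epoch, so no per-client rounding is needed. The final step may be smaller than the global batch size when the pooled dataset size is not a multiple of it.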
