Communication-Computation Pipeline Parallel Split Learning over Wireless Edge Networks

Chenyu Liu, Zhaoyang Zhang, Zirui Chen, Zhaohui Yang

TL;DR

This work tackles the latency and privacy challenges of training neural networks in wireless edge networks by integrating pipeline parallelism into split learning, forming C$^2$P$^2$SL. It introduces a joint optimization framework that selects the model cut layer, micro-batch count, per-user batch sizes, and TDMA time-slot allocations together, to minimize bubble time and maximize training efficiency. An alternating-optimization procedure decomposes the nonconvex MINLP into tractable subproblems, which are solved via MILP and convex optimization. Experiments show training time reduced by over 38% on average, with convergence accuracy preserved across varying bandwidths. The approach performs robustly under heterogeneous UE capabilities and provides a practical blueprint for scalable, privacy-preserving edge learning with pipeline scheduling.

Abstract

Split learning (SL) offloads the main computing tasks from multiple resource-constrained user equipments (UEs) to the base station (BS), while preserving local data privacy. However, its computation and communication processes remain sequential, resulting in limited system efficiency. To overcome this limitation, this paper applies pipeline parallelism (PP) from distributed training to SL in wireless networks, proposing communication-computation pipeline parallel split learning (C$^2$P$^2$SL). By treating the communication and computation processes of the UEs and the BS as one overall pipeline, C$^2$P$^2$SL parallelizes the micro-batches into which each batch of data samples is split. The resulting overlap of communication and computation significantly reduces the total training time. Because training efficiency is affected by the position of the cutting layer and the heterogeneity of the UEs, we formulate a joint optimization problem of task split and resource allocation, and design a solution based on alternating optimization. Experimental results demonstrate that C$^2$P$^2$SL reduces system training time by over 38\% while maintaining convergence accuracy under different communication conditions.
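The effect of overlapping communication and computation across micro-batches can be illustrated with a toy timing model. The sketch below is not the paper's latency model: the function name `pipeline_makespan`, the stage durations, and the assumptions that each stage is a separate resource and that its time scales linearly with micro-batch size are all hypothetical and chosen only for illustration.

```python
def pipeline_makespan(stage_times, k):
    """Finish time of one batch split into k equal micro-batches.

    stage_times: hypothetical full-batch duration of each pipeline stage
                 (e.g. UE compute, uplink, BS compute), in arbitrary units.
    k:           number of micro-batches (k = 1 means no pipelining).

    Assumption: a stage handles one micro-batch at a time, and a
    micro-batch enters a stage only after leaving the previous one.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")

    per_micro_batch = [t / k for t in stage_times]
    # finish[j] is the time at which micro-batch j left the latest stage.
    finish = [0.0] * k
    for duration in per_micro_batch:
        stage_free_at = 0.0  # when this stage finished its previous micro-batch
        for j in range(k):
            start = max(finish[j], stage_free_at)
            finish[j] = start + duration
            stage_free_at = finish[j]
    return finish[-1]


if __name__ == "__main__":
    stages = [2.0, 4.0, 2.0]  # hypothetical stage durations
    for k in (1, 2, 4, 8):
        print(k, pipeline_makespan(stages, k))
```

In this toy model the sequential case (`k = 1`) costs the sum of all stage times, and larger `k` shrinks the total toward the duration of the slowest stage. It ignores any per-micro-batch overhead, which is one reason the micro-batch count has to be optimized rather than simply maximized.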

Paper Structure

This paper contains 20 sections, 1 theorem, 23 equations, 5 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

For every fixed $(l,\boldsymbol{b},\boldsymbol{\tau})$, the optimal micro-batch number is $k=\left\lfloor\frac{1}{1-\eta}\right\rfloor$, where $\eta=\mathop{\max}\limits_i\frac{\tau_i b\sum_{j=l+1}^L(c_j^F+c_j^B)}{b_i T f_B\left((s_l+s_0)/r_u^i+s_l/r_d^i\right)}$.
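The closed form in Lemma 1 can be evaluated directly once its inputs are known. The sketch below only transcribes the formula; this summary does not define the symbols, so the parameter names and the readings given in the comments (for example, $c_j^F$ and $c_j^B$ as per-layer forward and backward workloads, $r_u^i$ and $r_d^i$ as uplink and downlink rates) are assumptions. The function name `optimal_micro_batches` and the error raised for $\eta \ge 1$ are also illustrative choices, not part of the paper.

```python
import math


def optimal_micro_batches(l, b, b_i, tau, c_fwd, c_bwd, T, f_B,
                          s_l, s_0, r_up, r_down):
    """Evaluate k = floor(1 / (1 - eta)) from Lemma 1.

    Assumed readings of the symbols (not defined in this summary):
      l            cut layer index, layers numbered 1..L
      b, b_i       batch size b and per-UE batch sizes b_i
      tau          per-UE time-slot allocations tau_i
      c_fwd/c_bwd  per-layer workloads c_j^F, c_j^B (lists of length L)
      T, f_B       the constants T and f_B from the formula
      s_l, s_0     the data sizes s_l and s_0 from the formula
      r_up/r_down  per-UE rates r_u^i and r_d^i
    """
    # Sum over layers j = l+1 .. L (lists are 0-indexed, so slice from l).
    work = sum(f + g for f, g in zip(c_fwd[l:], c_bwd[l:]))

    ratios = []
    for i in range(len(b_i)):
        comm = (s_l + s_0) / r_up[i] + s_l / r_down[i]
        ratios.append(tau[i] * b * work / (b_i[i] * T * f_B * comm))
    eta = max(ratios)

    if eta >= 1:
        # The closed form is undefined here; this guard is an assumption.
        raise ValueError("eta must be below 1 for the formula to apply")
    return math.floor(1 / (1 - eta))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the formula.
    k = optimal_micro_batches(
        l=1, b=8, b_i=[4, 4], tau=[0.5, 0.5],
        c_fwd=[1, 2, 3], c_bwd=[1, 2, 3],
        T=1, f_B=10, s_l=1, s_0=1,
        r_up=[2, 1], r_down=[1, 1],
    )
    print(k)
```

With these made-up inputs the two UEs give ratios of 0.5 and about 0.33, so $\eta = 0.5$ and $k = 2$. The maximum over UEs means the UE with the largest ratio determines $\eta$, and therefore the micro-batch count for the whole system.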

Figures (5)

  • Figure 1: The proposed C$^2$P$^2$SL over edge networks.
  • Figure 2: The training workflow of C$^2$P$^2$SL.
  • Figure 3: Test accuracy of various schemes with n=8.
  • Figure 4: Training convergence time of various schemes under different numbers of UEs.
  • Figure 5: Training convergence time of various schemes versus the system bandwidth with n=8.

Theorems & Definitions (1)

  • Lemma 1