MPC-Pipe: an Efficient Pipeline Scheme for Secure Multi-party Machine Learning Inference
Yongqin Wang, Rachit Rajat, Murali Annavaram
TL;DR
MPC-Pipe tackles the overhead of secure MPC for ML by identifying and removing unnecessary serialization between computation and communication in secret-sharing MPC. It introduces three pipeline schemes (inter-linear, inner-layer, and inter-batch) that overlap online-phase computation and communication during inference and training, and it is implemented on a modified CrypTen with separated offline and online phases. Empirical evaluation on CNNs and Transformers shows end-to-end throughput gains of up to ~50% and latency reductions of up to ~16% across LAN and WAN settings, along with improved resource utilization and scalability to more parties. The approach preserves accuracy, requires no changes to the underlying MPC protocols, and exploits domain-specific data-flow characteristics to deliver practical improvements for privacy-preserving ML workloads.
Abstract
Multi-party computing (MPC) has been gaining popularity as a secure computing model over the past few years. However, prior works have demonstrated that MPC protocols still pay substantial performance penalties compared to plaintext execution, particularly when applied to ML algorithms. The overhead is due to added computation and communication costs. Prior studies, as well as our own analysis, found that most MPC protocols today perform communication and computation sequentially: the participating parties must first compute on their shares and then communicate to distribute the new secret shares before proceeding to the next computation step. In this work, we show that this serialization is unnecessary, particularly in the context of ML computations (both in convolutional neural networks and in Transformer-based models). We demonstrate that the computation and communication steps can be carefully orchestrated so that they overlap. We propose MPC-Pipe, an efficient MPC system for both training and inference of ML workloads, which pipelines computation and communication in an MPC protocol during the online phase. MPC-Pipe proposes three pipeline schemes to optimize the online phase of ML in the semi-honest majority adversary setting. We implement MPC-Pipe by augmenting a modified version of CrypTen that separates the online and offline phases. We evaluate the end-to-end performance benefits for the online phase of MPC using deep neural networks (VGG16, ResNet50) and Transformers under different network settings. We show that MPC-Pipe can improve both the throughput and the latency of ML workloads.
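The benefit of overlapping computation with communication can be illustrated with a toy timing model. The sketch below is not MPC-Pipe's implementation and uses no CrypTen API; the function names and the per-chunk costs are hypothetical, and it assumes the work splits into independent chunks (for example, slices of a batch) so that one chunk can be computed while the previous one is still being transmitted.

```python
from typing import Sequence


def sequential_time(compute: Sequence[float], comm: Sequence[float]) -> float:
    """Serialized schedule: every chunk is computed and then sent
    before the next chunk starts computing."""
    return sum(compute) + sum(comm)


def pipelined_time(compute: Sequence[float], comm: Sequence[float]) -> float:
    """Two-stage pipeline: chunk i+1 is computed while chunk i is sent.

    Assumes one compute unit and one communication link, and that
    chunks do not depend on each other's communicated results.
    """
    compute_done = 0.0  # time at which the compute stage finishes the current chunk
    comm_done = 0.0     # time at which the link finishes sending the current chunk
    for c, m in zip(compute, comm):
        compute_done += c
        # Sending starts once the chunk is computed AND the link is free.
        comm_done = max(comm_done, compute_done) + m
    return comm_done


# Hypothetical per-chunk costs in arbitrary time units.
compute = [2.0, 2.0, 2.0, 2.0]
comm = [3.0, 3.0, 3.0, 3.0]

seq = sequential_time(compute, comm)
pipe = pipelined_time(compute, comm)
print(f"sequential: {seq}, pipelined: {pipe}")
```

In this model the pipelined schedule is bounded by the slower of the two stages plus a short fill time, so the gain depends on how balanced computation and communication are. Real speedups also depend on data dependencies between layers, which this sketch ignores.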
