Bicoptor: Two-round Secure Three-party Non-linear Computation without Preprocessing for Privacy-preserving Machine Learning
Lijing Zhou, Ziyu Wang, Hongrui Cui, Qingrui Song, Yu Yu
TL;DR
The paper tackles the bottleneck of non-linear computation in MPC-based privacy-preserving ML by introducing Bicoptor, a family of two-round 3PC protocols that need no preprocessing. Its core is a novel sign-detection protocol derived from a probabilistic truncation technique; it handles inputs of unknown bit-length through repeated truncation (TRC) evaluations and careful error analysis. The authors extend DReLU to ReLU, ABS, PLU, and MAX/MIN/SORT/MED, delivering GPU-friendly, constant-two-round protocols with substantial throughput gains in LAN experiments (e.g., up to ~390k DReLU and ~370k ReLU ops/s; ~110k MAX4 and ~41k MAX9 ops/s). The work reduces online PPML overhead and eliminates preprocessing costs, enabling scalable PPML deployments on public clouds.
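To make the "extend DReLU to ..." step concrete, the following plaintext sketch shows how several of the listed functions reduce to a single sign test. It is not the secure protocol: it operates on ordinary Python integers, the function names are illustrative, and the convention DReLU(0) = 1 is an assumption. The sequential fold in `max_n` is only for readability and says nothing about how the paper keeps the round count at two.

```python
from functools import reduce

def drelu(x: int) -> int:
    # Sign test: 1 for non-negative inputs, 0 otherwise.
    # Assumption: zero is treated as non-negative.
    return 1 if x >= 0 else 0

def relu(x: int) -> int:
    # ReLU(x) = DReLU(x) * x
    return drelu(x) * x

def abs_(x: int) -> int:
    # ABS(x) = (2 * DReLU(x) - 1) * x
    return (2 * drelu(x) - 1) * x

def max2(a: int, b: int) -> int:
    # MAX(a, b) = b + DReLU(a - b) * (a - b)
    return b + drelu(a - b) * (a - b)

def min2(a: int, b: int) -> int:
    # MIN(a, b) = a + b - MAX(a, b)
    return a + b - max2(a, b)

def max_n(values: list[int]) -> int:
    # Plaintext maximum of n inputs (e.g. n = 9 for a 3x3 Maxpool window),
    # computed here as a simple left fold over pairwise maxima.
    return reduce(max2, values)
```

In a secure setting each call to `drelu` would be replaced by a sign-detection protocol on secret shares, and the multiplications by a secure multiplication; the identities themselves are unchanged.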
Abstract
The overhead of non-linear functions dominates the performance of secure multiparty computation (MPC) based privacy-preserving machine learning (PPML). This work introduces a family of novel secure three-party computation (3PC) protocols, Bicoptor, which improve the efficiency of evaluating non-linear functions. The basis of Bicoptor is a new sign determination protocol, which relies on a clever use of the truncation protocol proposed in SecureML (S&P 2017). Our 3PC sign determination protocol requires only two communication rounds and does not involve any preprocessing. Such a sign determination protocol is well-suited for computing non-linear functions in PPML, e.g., the activation function ReLU, Maxpool, and their variants. We develop suitable protocols for these non-linear functions, which form a family of GPU-friendly protocols, Bicoptor. All Bicoptor protocols require only two communication rounds without preprocessing. We evaluate Bicoptor under a 3-party LAN network over a public cloud, and achieve more than 370,000 DReLU/ReLU or 41,000 Maxpool (finding the maximum value of nine inputs) operations per second. Under the same settings and environment, and without batch processing, our ReLU protocol achieves an improvement of one or even two orders of magnitude over the state-of-the-art works Falcon (PETS 2021) and Edabits (CRYPTO 2020), respectively.
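As background for the truncation protocol the abstract refers to, here is a minimal sketch of SecureML-style local truncation on two additive shares. It assumes a ring Z_{2^64}, two's-complement encoding of signed values, and illustrative function names. It shows only the probabilistic truncation building block, not Bicoptor's 3PC sign determination protocol, whose masking and repeated-truncation steps are not reproduced here.

```python
import secrets

L = 64            # ring bit-length (assumption for this sketch)
MOD = 1 << L

def encode(x: int) -> int:
    # Map a signed integer into Z_{2^L} (two's complement).
    return x % MOD

def decode(v: int) -> int:
    # Map a ring element back to a signed integer.
    return v - MOD if v >= MOD // 2 else v

def share(x: int) -> tuple[int, int]:
    # Additive sharing: s0 + s1 = x (mod 2^L), with s0 uniformly random.
    s0 = secrets.randbelow(MOD)
    s1 = (encode(x) - s0) % MOD
    return s0, s1

def truncate_shares(s0: int, s1: int, d: int) -> tuple[int, int]:
    # Each party truncates its own share locally, with no communication.
    # Party 0 shifts its share; party 1 shifts the negated share and negates back.
    t0 = s0 >> d
    t1 = (MOD - ((MOD - s1) >> d)) % MOD
    return t0, t1

def reconstruct(t0: int, t1: int) -> int:
    return decode((t0 + t1) % MOD)
```

For an input x that is small relative to 2^L, reconstructing the truncated shares yields x shifted right by d bits, up to an error of at most 1, except with a probability that shrinks as the gap between the bit-length of x and L grows. For example, `reconstruct(*truncate_shares(*share(123456), 8))` is expected to land within 1 of `123456 >> 8`.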
