Bicoptor: Two-round Secure Three-party Non-linear Computation without Preprocessing for Privacy-preserving Machine Learning

Lijing Zhou, Ziyu Wang, Hongrui Cui, Qingrui Song, Yu Yu

TL;DR

The paper tackles the bottleneck of non-linear computations in MPC-based privacy-preserving ML by introducing Bicoptor, a family of two-round 3PC protocols that need no preprocessing. Its core is a novel sign-determination protocol derived from a probabilistic truncation technique; it handles inputs of unknown bit-length through repeated truncation (TRC) evaluations, backed by a careful error analysis. The authors extend DReLU to ReLU, ABS, PLU, and MAX/MIN/SORT/MED, delivering GPU-friendly protocols that each take a constant two rounds and achieve substantial throughput gains in LAN experiments (e.g., up to ~390k DReLU and ~370k ReLU ops/s; ~110k MAX4 and ~41k MAX9 ops/s). The work reduces online PPML overhead and eliminates preprocessing costs, enabling scalable PPML deployments on public clouds.

Abstract

The overhead of non-linear functions dominates the performance of secure multiparty computation (MPC) based privacy-preserving machine learning (PPML). This work introduces a family of novel secure three-party computation (3PC) protocols, Bicoptor, which improve the efficiency of evaluating non-linear functions. The basis of Bicoptor is a new sign determination protocol, which relies on a clever use of the truncation protocol proposed in SecureML (S&P 2017). Our 3PC sign determination protocol requires only two communication rounds and does not involve any preprocessing. Such a sign determination protocol is well-suited for computing non-linear functions in PPML, e.g., the activation function ReLU, Maxpool, and their variants. We develop suitable protocols for these non-linear functions, which form a family of GPU-friendly protocols, Bicoptor. All Bicoptor protocols require only two communication rounds without preprocessing. We evaluate Bicoptor under a 3-party LAN network over a public cloud, and achieve more than 370,000 DReLU/ReLU or 41,000 Maxpool (find the maximum value of nine inputs) operations per second. Under the same settings and environment, our ReLU protocol achieves an improvement of one or even two orders of magnitude over the state-of-the-art works Falcon (PETS 2021) and Edabits (CRYPTO 2020), respectively, without batch processing.
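
To make concrete why a sign test is enough for ReLU and Maxpool, here is a minimal plaintext sketch. It involves no secret sharing and no communication, so it is not the Bicoptor protocol itself. The ring size `ELL`, the input bound `ELL_X`, the helper names, and the convention that DReLU returns 1 for non-negative inputs are all assumptions made for illustration.

```python
# Plaintext illustration only: shows how a sign bit (DReLU) yields ReLU and max.
ELL = 64            # assumed ring bit-length, q = 2^ELL
Q = 1 << ELL
ELL_X = 31          # assumed bound: inputs satisfy |v| < 2^ELL_X


def encode(v: int) -> int:
    """Map a signed integer into Z_q; negatives land in the top of the ring."""
    assert -(1 << ELL_X) < v < (1 << ELL_X)
    return v % Q


def decode(x: int) -> int:
    """Inverse of encode: ring elements in the upper half are read as negative."""
    return x - Q if x >= Q // 2 else x


def drelu(x: int) -> int:
    """Sign test: 1 if the encoded value is non-negative, else 0 (assumed convention)."""
    return 1 if x < Q // 2 else 0


def relu(x: int) -> int:
    """ReLU(x) = DReLU(x) * x."""
    return (drelu(x) * x) % Q


def max2(a: int, b: int) -> int:
    """max(a, b) = b + ReLU(a - b); |a - b| < 2^(ELL_X + 1) still fits the ring."""
    return (b + relu((a - b) % Q)) % Q


def maxpool(xs: list[int]) -> int:
    """Sequential max over encoded inputs, e.g. nine values for a 3x3 window."""
    m = xs[0]
    for x in xs[1:]:
        m = max2(m, x)
    return m
```

For example, `decode(maxpool([encode(v) for v in [3, -2, 9, 0, -7, 4, 1, 8, 5]]))` evaluates to `9`. The sequential loop is chosen for readability; it says nothing about how the two-round MAX protocol arranges its comparisons.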

Paper Structure

This paper contains 27 sections, 7 theorems, 31 equations, 7 figures, 11 tables, 6 algorithms.

Key Result

Lemma 1

In a ring $\mathbb{Z}_q$, let $x\in[0,2^{\ell_x})\cup(q-2^{\ell_x},q)$, where $\ell>\ell_x + 1$. Then we have the following results with probability $1-2^{\ell_x + 1 - \ell}$:
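
As a rough illustration of the setting, the sketch below simulates the two-party local truncation from SecureML, which the abstract names as the basis of the sign determination protocol. It assumes $q = 2^{\ell}$ with $\ell = 64$, $\ell_x = 32$, and a truncation width `K = 13`; these parameters and all function names are hypothetical. The check that the reconstructed value is off by at most 1 follows the guarantee as stated in SecureML, to the best of our understanding, and is not taken from the lemma text above.

```python
import random

ELL = 64            # assumed bit-length of q
Q = 1 << ELL
ELL_X = 32          # assumed bound: |v| < 2^ELL_X, so ELL > ELL_X + 1 holds
K = 13              # assumed number of low bits to drop


def decode(x: int) -> int:
    """Read ring elements in the upper half of Z_q as negative integers."""
    return x - Q if x >= Q // 2 else x


def truncate_shares(x0: int, x1: int, k: int) -> tuple[int, int]:
    """Each party truncates its own share locally; no messages are exchanged."""
    t0 = x0 >> k
    t1 = (Q - (((Q - x1) % Q) >> k)) % Q   # q - ((q - x1) >> k), reduced mod q
    return t0, t1


def truncation_error(v: int, k: int, rng: random.Random) -> int:
    """Share v additively, truncate the shares, and compare with v >> k."""
    x = v % Q
    r = rng.randrange(Q)                    # uniform mask
    x0, x1 = (x + r) % Q, (-r) % Q          # x0 + x1 = x (mod q)
    t0, t1 = truncate_shares(x0, x1, k)
    return decode((t0 + t1) % Q) - (v >> k)


rng = random.Random(0)
errors = [
    truncation_error(rng.randrange(-(1 << ELL_X) + 1, 1 << ELL_X), K, rng)
    for _ in range(2000)
]
```

With these parameters a large error requires the mask `r` to wrap the sum around the ring, which happens with probability about $2^{\ell_x - \ell} = 2^{-32}$ per trial, so every entry of `errors` is expected to lie in $\{-1, 0, 1\}$. Shrinking `ELL` toward `ELL_X + 1` makes such failures easy to observe.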

Figures (7)

  • Figure 1: The CryptGPU experiments are named as dataset_modelname_batchsize, e.g., Tiny_vgg16_8 corresponds to the vgg16 model trained on the Tiny dataset, with inference run at a batch size of 8. The percentage after each name gives the ratio of ReLU latency to total latency, e.g., ReLU accounts for 32% of the latency of the Tiny_vgg16_8 experiment in the local environment.
  • Figure 2: System settings.
  • Figure 3: Strawman-DReLU and DReLU protocol overview.
  • Figure 4: Strawman-ReLU and ReLU protocol overview.
  • Figure 5: MAX example.
  • ...and 2 more figures

Theorems & Definitions (7)

  • Lemma 1
  • Theorem 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Lemma 6