Secure multiparty computations in floating-point arithmetic

Chuan Guo, Awni Hannun, Brian Knott, Laurens van der Maaten, Mark Tygert, Ruiyu Zhu

TL;DR

This work presents a practical framework for secure multiparty computation in floating-point arithmetic, enabling privacy-preserving machine learning without resorting to modular arithmetic. It combines additive sharing and Beaver multiplication, with rigorous information-leakage bounds and numerical stability analyses, to operate on standard double-precision hardware. The authors develop polynomial approximation techniques (Newton iterations, Chebyshev series, and softmax scaling) to securely compute common ML functions, and validate the approach on synthetic data and real datasets (MNIST, covtype, and horsekicks) using CrypTen on PyTorch. The results show near-plaintext accuracy and generalization, while quantifying controlled leakage and demonstrating feasible performance on commodity hardware, highlighting the method’s practical potential for privacy-preserving analytics.

Abstract

Secure multiparty computations enable the distribution of so-called shares of sensitive data to multiple parties such that the multiple parties can effectively process the data while being unable to glean much information about the data (at least not without collusion among all parties to put back together all the shares). Thus, the parties may conspire to send all their processed results to a trusted third party (perhaps the data provider) at the conclusion of the computations, with only the trusted third party being able to view the final results. Secure multiparty computations for privacy-preserving machine-learning turn out to be possible using solely standard floating-point arithmetic, at least with a carefully controlled leakage of information less than the loss of accuracy due to roundoff, all backed by rigorous mathematical proofs of worst-case bounds on information loss and numerical stability in finite-precision arithmetic. Numerical examples illustrate the high performance attained on commodity off-the-shelf hardware for generalized linear models, including ordinary linear least-squares regression, binary and multinomial logistic regression, probit regression, and Poisson regression.
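
As a rough illustration of the additive sharing and Beaver multiplication mentioned in the TL;DR, the following is a minimal two-party sketch in ordinary double-precision floats. It only demonstrates the arithmetic; the function names, the trusted-dealer setup for the triple, and the noise widths are assumptions made for this example and are not taken from the paper or from CrypTen.

```python
import random

def share(x, width, rng):
    """Split x into two additive shares; the mask is uniform on [-width, width]."""
    mask = rng.uniform(-width, width)
    return (x - mask, mask)

def reconstruct(shares):
    """Recombine two additive shares (exact up to floating-point roundoff)."""
    return shares[0] + shares[1]

def beaver_multiply(x_sh, y_sh, gamma, rng):
    """Return shares of x*y given shares of x and y (hypothetical helper)."""
    # Assumed trusted dealer: random a, b and their product c = a*b, all shared.
    a = rng.uniform(-gamma, gamma)
    b = rng.uniform(-gamma, gamma)
    a_sh = share(a, gamma, rng)
    b_sh = share(b, gamma, rng)
    c_sh = share(a * b, gamma * gamma, rng)
    # Only the masked differences x - a and y - b are opened to both parties.
    eps = reconstruct((x_sh[0] - a_sh[0], x_sh[1] - a_sh[1]))
    delta = reconstruct((y_sh[0] - b_sh[0], y_sh[1] - b_sh[1]))
    # x*y = c + eps*b + delta*a + eps*delta; party 0 adds the public term.
    z0 = c_sh[0] + eps * b_sh[0] + delta * a_sh[0] + eps * delta
    z1 = c_sh[1] + eps * b_sh[1] + delta * a_sh[1]
    return (z0, z1)

rng = random.Random(0)
x_sh = share(0.75, 1e3, rng)
y_sh = share(-1.5, 1e3, rng)
product = reconstruct(beaver_multiply(x_sh, y_sh, 1e3, rng))
```

Because the masks are much larger than the data, the reconstructed product agrees with the plaintext product only up to roundoff that grows with the mask width, which is the trade-off between leakage and accuracy that the abstract describes.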

Paper Structure

This paper contains 25 sections, 7 theorems, 76 equations, 13 figures, 6 tables.

Key Result

Theorem 1

Suppose that $X$ and $Y$ are independent scalar random variables and $\beta$ and $\gamma$ are positive real numbers such that $|X| \le \beta < \gamma$ and $Y$ is distributed uniformly over $[-\gamma, \gamma]$. Then, the information leaked about $X$ from observing $X+Y$ satisfies where $I$ denotes the mutual information between $X$ and $X+Y$, measured in bits (not nats); the mutual information sat
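
Since $X$ and $Y$ are independent, the mutual information can be written as $I(X; X+Y) = h(X+Y) - h(Y)$, where $h$ is differential entropy. The sketch below evaluates this numerically in bits for one hypothetical case, $X = \pm\beta$ with probability $1/2$ each. The choice of $X$, the function names, and the midpoint-rule integration are assumptions made for illustration; the code does not reproduce the theorem's displayed bound.

```python
import math

def density_sum(z, beta, gamma):
    """Density of Z = X + Y at z, with X = +/-beta (probability 1/2 each)
    and Y uniform on [-gamma, gamma]."""
    def uniform_pdf(y):
        return 1.0 / (2.0 * gamma) if -gamma <= y <= gamma else 0.0
    return 0.5 * uniform_pdf(z - beta) + 0.5 * uniform_pdf(z + beta)

def mutual_information_bits(beta, gamma, n=20_000):
    """I(X; X+Y) = h(X+Y) - h(Y) in bits; h(X+Y) via a midpoint rule."""
    lo, hi = -(gamma + beta), gamma + beta
    dz = (hi - lo) / n
    h_z = 0.0
    for i in range(n):
        f = density_sum(lo + (i + 0.5) * dz, beta, gamma)
        if f > 0.0:
            h_z -= f * math.log2(f) * dz
    h_y = math.log2(2.0 * gamma)  # entropy of the uniform mask
    return h_z - h_y

leak = mutual_information_bits(beta=1.0, gamma=4.0)
```

For $\beta = 1$ and $\gamma = 4$ this evaluates to about 0.25 bits, which equals $\beta/\gamma$ for this particular two-point $X$; widening the noise (larger $\gamma$) drives the leakage toward zero.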

Figures (13)

  • Figure 1: Relative error in computation of $1/x$ with 30 iterations of (\ref{inverse})
  • Figure 2: Relative error in computation of $1/\sqrt{x}$ with 26 iterations of (\ref{invs})
  • Figure 3: Relative error in computation of $x^{-1/8}$ with 24 iterations of (\ref{inv8})
  • Figure 4: Absolute error in computation of $|x| = x \mathop{\mathrm{sgn}}(x)$ with 60 iterations of (\ref{sgn}); the figure superimposes a white curve over a black curve, where the white curve uses $y_0 = -x / \gamma$ to start (\ref{sgn}) while the black curve uses $y_0 = x / \gamma$, both with $\gamma = 10^5$
  • Figure 5: Euclidean norm of the difference between the ideal normalized weight vector $w/\|w\|_2$ and its computed approximation $x/\|x\|_2$ as a function of the width $\gamma$ of the uniform noise on $[-\gamma, \gamma]$ added to the shares of data (the lines for the logit and probit links overlap)
  • ...and 8 more figures
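
The iterations referenced in these captions are not reproduced on this page. As a hedged illustration of why Newton-type iterations suit secure computation (each step needs only additions and multiplications), the sketch below runs the textbook Newton iteration for the reciprocal, $y_{k+1} = y_k (2 - x y_k)$. Treating this as the form used for Figure 1 is an assumption, and the starting value `y0` is a hypothetical parameter chosen so that $0 < x y_0 < 2$.

```python
def newton_reciprocal(x, y0, iterations=30):
    """Approximate 1/x using only additions and multiplications.

    Converges when 0 < x * y0 < 2; the error 1 - x*y is squared at each step.
    """
    y = y0
    for _ in range(iterations):
        y = y * (2.0 - x * y)
    return y

approx = newton_reciprocal(7.0, y0=1e-3)
relative_error = abs(approx * 7.0 - 1.0)
```

A small starting value keeps the iteration inside its convergence region for a wide range of positive inputs, at the cost of more iterations before the quadratic convergence takes effect.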

Theorems & Definitions (9)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Corollary 5
  • Lemma 6
  • proof
  • Lemma 7
  • proof