Neural Networks with (Low-Precision) Polynomial Approximations: New Insights and Techniques for Accuracy Improvement

Chi Zhang, Jingjing Fan, Man Ho Au, Siu Ming Yiu

TL;DR

This work analyzes polynomial-approximation neural networks (PANN) for privacy-preserving inference, showing that approximation errors are not uniformly benign and that negative ReLU inputs are particularly sensitive. It provides two theorems bounding the loss growth caused by approximation noise and weight decay, offering a theoretical basis for improving PANN robustness. The authors propose Negative Inputs Leakage (NGNV) and a strategy of minimal weight regularization with Mixup, achieving substantial accuracy gains at the same precision and enabling lower-precision requirements (e.g., moving from $2^{-12}$ to $2^{-9}$) on models like ResNet-20 for CIFAR-10. Empirical results across multiple architectures and datasets demonstrate improved PANN performance, compatibility with ReLU-replacement and fixed-point protocols, and meaningful reductions in computation time and communication overhead, highlighting practical impact for PPML deployments.
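The summary mentions Mixup as part of the training strategy. As a rough illustration of what Mixup does to a training batch, here is a minimal NumPy sketch of the standard formulation (convex combinations of inputs and one-hot labels with a Beta-distributed coefficient). The function name `mixup_batch` and the default `alpha=1.0` are assumptions for illustration; the paper's actual hyperparameters and training pipeline are not given in this summary.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=1.0, rng=None):
    """Standard Mixup on one batch (illustrative sketch, not the authors' code).

    x        : array of shape (batch, ...) holding the inputs
    y_onehot : array of shape (batch, classes) holding one-hot labels
    alpha    : Beta(alpha, alpha) parameter; 1.0 is an assumed default
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))        # partner sample for each row
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam
```

Because the labels are mixed with the same coefficient as the inputs, each mixed label row still sums to one and can be used directly with a soft-label cross-entropy loss.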

Abstract

Replacing non-polynomial functions (e.g., non-linear activation functions such as ReLU) in a neural network with their polynomial approximations is a standard practice in privacy-preserving machine learning. The resulting neural network, called polynomial approximation of neural network (PANN) in this paper, is compatible with advanced cryptosystems to enable privacy-preserving model inference. Using "highly precise" approximations, state-of-the-art PANN offers inference accuracy similar to that of the underlying backbone model. However, little is known about the effect of approximation, and the existing literature has often determined the required approximation precision empirically. In this paper, we initiate the investigation of PANN as a standalone object. Specifically, our contribution is two-fold. Firstly, we provide an explanation of the effect of approximation error in PANN. In particular, we discovered that (1) PANN is susceptible to certain types of perturbations; and (2) weight regularisation significantly reduces PANN's accuracy. We support our explanation with experiments. Secondly, based on the insights from our investigations, we propose solutions to increase inference accuracy for PANN. Experiments showed that the combination of our solutions is very effective: at the same precision, our PANN is 10% to 50% more accurate than the state of the art; and at the same accuracy, our PANN only requires a precision of $2^{-9}$, while the state-of-the-art solution requires a precision of $2^{-12}$, using the ResNet-20 model on the CIFAR-10 dataset.
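To make the idea of a polynomial ReLU concrete, the sketch below uses the identity $\mathrm{ReLU}(z) = \tfrac{1}{2} z\,(1 + \operatorname{sgn}(z))$ and replaces $\operatorname{sgn}$ with a polynomial. It is only an illustration under stated assumptions: the polynomial is a plain least-squares Chebyshev fit (a stand-in, not the minimax approximation used in the cited work), inputs are assumed to be scaled to $[-1, 1]$, the gap half-width `eps=0.2` is arbitrary, and "precision" is taken here to mean the maximum absolute error of the sign approximation outside the gap. All function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def _grid(eps, n_samples):
    """Sample points on [-1, -eps] U [eps, 1], skipping the gap around 0."""
    pos = np.linspace(eps, 1.0, n_samples)
    return np.concatenate([-pos[::-1], pos])

def fit_sign_poly(degree, eps=0.2, n_samples=2001):
    """Least-squares Chebyshev fit to sgn(z) away from zero (assumed stand-in)."""
    z = _grid(eps, n_samples)
    return C.Chebyshev.fit(z, np.sign(z), degree, domain=[-1, 1])

def max_sign_error(sign_poly, eps=0.2, n_samples=5001):
    """Largest |p(z) - sgn(z)| on the sampled domain; our proxy for precision."""
    z = _grid(eps, n_samples)
    return float(np.max(np.abs(sign_poly(z) - np.sign(z))))

def poly_relu(z, sign_poly):
    """Polynomial ReLU via ReLU(z) = z * (1 + sgn(z)) / 2."""
    return 0.5 * z * (1.0 + sign_poly(z))

if __name__ == "__main__":
    for degree in (7, 15, 31):
        p = fit_sign_poly(degree)
        print(degree, max_sign_error(p))
```

Running the script prints the measured sign error per degree. Under these assumptions, reaching a tighter precision requires a higher degree, which is the cost that lower-precision PANNs aim to avoid.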

Paper Structure

This paper contains 29 sections, 11 theorems, 55 equations, 5 figures, 12 tables, 1 algorithm.

Key Result

Theorem 2.1

Let $h_1: \mathbb{R} \to \mathbb{R}$ and $h_2: \mathbb{R} \to \mathbb{R}$ be two convex functions that are differentiable on $(-\infty,0)\cup(0,\infty)$. Let $\overline{z}_1 \in \mathbb{R}$ and $\overline{z}_2 \in \mathbb{R}$, with $\overline{z}_1<0$ and $\overline{z}_2>0$, be such that $h_1'(\overline{z}_1)=h_2'(\overli

Figures (5)

  • Figure 1: Test loss increment (compared to the baseline) of PANN [lee2022low] when approximating ReLU on positive inputs (pos) and negative inputs (neg), for models trained with weight decay (wd) 0 and 1e-4. Weight decay can amplify the loss increment caused by approximation. Additionally, approximating ReLU on negative inputs increases the loss more than approximating it on positive inputs.
  • Figure 2: Backbone models (bb) trained with different weight decay (wd) have similar accuracy (dotted line). However, large weight decay can reduce their accuracy on PANN as the number of epochs increases. (ResNet-18, CIFAR-100, precision $2^{-9}$)
  • Figure 3: Approximation errors for the minimax approximation of $\operatorname{sgn}(z)$ on $[-1,-\epsilon]$ with precision $2^{-10}$ (left) and $2^{-14}$ (right).
  • Figure 4: Time cost for PANN with our models and with the non-fine-tuning SOTA [lee2021privacypreserving, lee2022low] (to achieve the SOTA accuracy on CIFAR-10).
  • Figure 5: Benign samples and perturbations $\delta$ that affect only PANN. $\delta$ can concentrate on image backgrounds. (ResNet-18, ImageNet, precision $2^{-14}$)

Theorems & Definitions (15)

  • Theorem 2.1
  • Theorem 2.2
  • Theorem B.1
  • Lemma B.2
  • Lemma B.3: Increased loss for $\overline{z}<0$
  • proof
  • Lemma B.4: Increased loss for $\overline{z}>0$
  • Theorem B.5
  • Lemma B.6
  • proof
  • ...and 5 more