PAPER: Privacy-Preserving Convolutional Neural Networks using Low-Degree Polynomial Approximations and Structural Optimizations on Leveled FHE
Eduardo Chielle, Manaar Alam, Jinting Liu, Jovan Kascelan, Michail Maniatakos
TL;DR
This work tackles the challenge of privacy-preserving CNN inference under leveled FHE (LFHE) by drastically reducing multiplicative depth and avoiding bootstrapping. It introduces a quadratic activation with a penalty-based training regime to achieve the theoretical minimum depth for nonlinear activations, complemented by structural optimizations (Node Fusing, Weight Redistribution, Tower Reuse) and co-design techniques (data layout, slice/ensemble clustering) to enable deep models like ResNet-32 under LFHE. A key contribution is enabling ensemble polynomial networks within a single ciphertext via shared codebooks, recovering accuracy lost to low-degree polynomials. Empirically, the approach yields up to 4× faster private inference on CIFAR and Tiny-ImageNet than prior methods, with accuracy close to plaintext ReLU models, a step toward practical privacy-preserving machine learning (PPML) deployments using LFHE.
Abstract
Recent work using Fully Homomorphic Encryption (FHE) has made non-interactive privacy-preserving inference of deep Convolutional Neural Networks (CNNs) possible. However, the performance of these methods remains limited by their heavy reliance on bootstrapping, a costly FHE operation applied across multiple layers, severely slowing inference. Moreover, they depend on high-degree polynomial approximations of non-linear activations, which increase multiplicative depth and reduce accuracy by 2-5% compared to plaintext ReLU models. In this work, we close the accuracy gap between FHE-based non-interactive CNNs and their plaintext counterparts, while also achieving faster inference than existing methods. We propose a quadratic polynomial approximation of ReLU, which achieves the theoretical minimum multiplicative depth for non-linear activations, together with a penalty-based training strategy. We further introduce structural optimizations that reduce the required FHE levels in CNNs by a factor of five compared to prior work, allowing us to run deep CNN models under leveled FHE without bootstrapping. To further accelerate inference and recover accuracy typically lost with polynomial approximations, we introduce parameter clustering along with a joint strategy of data layout and ensemble techniques. Experiments with VGG and ResNet models on CIFAR and Tiny-ImageNet datasets show that our approach achieves up to $4\times$ faster private inference than prior work, with accuracy comparable to plaintext ReLU models.
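To make the depth argument concrete, the following is a minimal plaintext sketch, not the authors' implementation. It assumes a generic quadratic activation a*x^2 + b*x + c with illustrative coefficients (the paper's trained coefficients are not given here) and counts only ciphertext-by-ciphertext multiplications. The names quad_act and min_mult_depth are hypothetical. A degree-d polynomial needs at least ceil(log2(d)) sequential multiplications, so a quadratic needs exactly one, which is the minimum any non-linear polynomial can achieve.

```python
def quad_act(x, a=0.25, b=0.5, c=0.0):
    """Quadratic stand-in for ReLU: a*x^2 + b*x + c.

    The coefficients are illustrative assumptions, not values from the paper.
    Under FHE, x * x is the single ciphertext-by-ciphertext multiplication.
    """
    return a * x * x + b * x + c


def min_mult_depth(degree):
    """Lower bound on multiplicative depth for a polynomial of this degree.

    Each multiplication at most doubles the degree, so reaching degree d
    takes at least ceil(log2(d)) sequential multiplications.
    """
    if degree < 1:
        raise ValueError("degree must be at least 1")
    # (degree - 1).bit_length() equals ceil(log2(degree)) for integers >= 1.
    return (degree - 1).bit_length()


if __name__ == "__main__":
    for d in (2, 7, 15, 27):
        print(f"degree {d:2d}: depth >= {min_mult_depth(d)}")
```

Plaintext-by-ciphertext coefficient multiplications and rescaling can consume additional levels in some schemes, so the actual level budget per activation depends on the scheme and on how coefficients are folded into neighboring layers.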
