NeuJeans: Private Neural Network Inference with Joint Optimization of Convolution and FHE Bootstrapping
Jae Hyung Ju, Jaiyoung Park, Jongmin Kim, Minsik Kang, Donghwan Kim, Jung Hee Cheon, Jung Ho Ahn
TL;DR
NeuJeans targets private neural network inference (PI) over CKKS-based fully homomorphic encryption (FHE). Its core contribution is Coefficients-in-Slot (CinS) encoding, which evaluates multiple convolutions in a single homomorphic multiplication without costly slot permutations. CinS encoding coincides with the first several steps of the DFT applied to a ciphertext in conventional Slot encoding, and bootstrapping itself begins with a DFT, so NeuJeans fuses conv2d with bootstrapping and avoids a separate conversion between the two encodings. The paper also proposes FHE-friendly execution flows for conv2d variants, including downsampling and depthwise convolutions. Empirically, NeuJeans speeds up conv2d-activation sequences by up to 5.68× over state-of-the-art FHE-based PI work and runs end-to-end ImageNet-scale CNN inference in seconds for models like ResNet18 and tens of seconds for larger networks, indicating practical viability for private inference services.
Abstract
Fully homomorphic encryption (FHE) is a promising cryptographic primitive for realizing private neural network inference (PI) services: it allows a client to fully offload the inference task to a cloud server while keeping the client data hidden from the server. This work proposes NeuJeans, an FHE-based solution for the PI of deep convolutional neural networks (CNNs). NeuJeans tackles the critical problem of the enormous computational cost of evaluating CNNs under FHE. We introduce a novel encoding method called Coefficients-in-Slot (CinS) encoding, which enables multiple convolutions in one HE multiplication without costly slot permutations. We further observe that CinS encoding is obtained by conducting the first several steps of the Discrete Fourier Transform (DFT) on a ciphertext in conventional Slot encoding. Because bootstrapping a ciphertext starts with a DFT, this property lets us skip the conversion between CinS and Slot encodings. Exploiting this, we devise optimized execution flows for various two-dimensional convolution (conv2d) operations and apply them to end-to-end CNN implementations. NeuJeans accelerates conv2d-activation sequences by up to 5.68 times compared to state-of-the-art FHE-based PI work and performs the PI of a CNN at the scale of ImageNet within a few seconds.
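The link between a partial DFT and "multiple convolutions in one multiplication" can be illustrated on unencrypted data. The sketch below is a plaintext analogy only, not the NeuJeans encoding itself: it works in the ring C[X]/(X^N - 1) rather than the X^N + 1 ring used by CKKS, and the helper names `partial_dft` and `twisted_conv` are hypothetical. It checks that applying a length-k DFT across strided blocks of the coefficients splits one large polynomial product into k independent, shorter convolution-like products, one per slice.

```python
import numpy as np

def twisted_conv(p, q, c):
    """Multiply polynomials p and q (each with m coefficients) modulo X^m - c."""
    m = len(p)
    full = np.convolve(p, q).astype(complex)  # linear convolution, length 2m - 1
    out = full[:m].copy()
    out[: m - 1] += c * full[m:]              # wrap around using X^m = c
    return out

def partial_dft(a, k):
    """Reduce a(X) mod (X^m - w^j) for j = 0..k-1, where m = N/k and w = exp(2*pi*i/k).

    Row j of the result holds the m coefficients of that reduction. This is a
    length-k DFT taken across the k blocks of stride m, i.e. only part of a
    full length-N transform.
    """
    n = len(a)
    m = n // k
    blocks = np.asarray(a, dtype=complex).reshape(k, m)  # blocks[s, t] = a[s*m + t]
    # np.fft.ifft uses exp(+2*pi*i*j*s/k) with a 1/k factor, so undo the 1/k.
    return np.fft.ifft(blocks, axis=0) * k

rng = np.random.default_rng(0)
N, k = 16, 4                                  # illustrative sizes (assumptions)
a = rng.standard_normal(N)
b = rng.standard_normal(N)

# One "big" product in C[X]/(X^N - 1).
c = twisted_conv(a, b, 1.0)

# The same product computed slice by slice after the partial DFT.
A, B, C = partial_dft(a, k), partial_dft(b, k), partial_dft(c, k)
roots = np.exp(2j * np.pi * np.arange(k) / k)
slices = np.stack([twisted_conv(A[j], B[j], roots[j]) for j in range(k)])

ok = np.allclose(slices, C)
print(ok)
```

Each row of `slices` is a short convolution that depends only on the matching rows of `A` and `B`, which is the plaintext counterpart of performing several local convolutions with a single ring multiplication. How NeuJeans maps conv2d kernels and feature maps onto such slices is specific to the paper and is not reproduced here.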
