Exploiting Unstructured Sparsity in Fully Homomorphic Encrypted DNNs
Aidan Ferguson, Perry Gibson, Lara D'Agata, Parker McLeod, Ferhat Yaman, Amitabh Das, Ian Colbert, José Cano
TL;DR
This work tackles the high overhead of fully homomorphic encryption (FHE) for DNN inference by exploiting unstructured sparsity in matrix multiplication. It introduces three sparse FHE matmul schemes (Naïve Sparse, CSR, ELLPACK) and a CPU multi-threading approach within the CKKS framework to accelerate encrypted matrix operations while maintaining accuracy. The sparse schemes give a $2.5\times$ average speedup at 50% sparsity, and multi-threading gives up to $32.5\times$ over the equivalent single-threaded sparse computation on 64 cores, along with memory savings at high sparsity for some schemes and per-element error below $10^{-3}$. These findings make privacy-preserving DNN inference more practical on commodity hardware and are supported by an open-source implementation.
Abstract
The deployment of deep neural networks (DNNs) in privacy-sensitive environments is constrained by the computational overhead of fully homomorphic encryption (FHE). This paper explores unstructured sparsity in FHE matrix multiplication schemes as a means of reducing this burden while still meeting model accuracy requirements. We demonstrate that sparsity can be exploited in arbitrary matrix multiplication, providing runtime benefits over a baseline naive algorithm at all sparsity levels. This is a notable departure from the plaintext domain, where the benefit of sparsity must be traded off against the overhead of the sparse multiplication algorithm. In addition, we propose three sparse multiplication schemes in FHE based on common plaintext sparse encodings. We demonstrate that the performance gain is scheme-invariant; however, some sparse schemes vastly reduce the memory storage requirements of the encrypted matrix at high sparsity levels. Our proposed sparse schemes yield an average performance gain of 2.5x at 50% unstructured sparsity, and our multi-threading scheme provides a 32.5x performance increase over the equivalent single-threaded sparse computation when utilizing 64 cores.
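To make the "common plaintext sparse encodings" concrete, the following is a minimal plaintext sketch of CSR and ELLPACK encodings and of a CSR matrix multiplication that skips zero entries. It is an illustration only, not the paper's FHE implementation: the function names (`to_csr`, `to_ellpack`, `csr_matmul`) and the example matrices are hypothetical, and no encryption is performed. Under the assumption that each scalar multiplication stands in for one costly encrypted multiplication, the multiplication count gives a rough sense of why skipping zeros could help.

```python
def to_csr(dense):
    """Encode a dense matrix (list of rows) as CSR: values, column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end offset of this row in `values`
    return values, col_idx, row_ptr


def to_ellpack(dense):
    """Encode as ELLPACK: every row is padded to the largest nonzero count of any row."""
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    width = max((len(r) for r in rows), default=0)
    vals = [[v for _, v in r] + [0] * (width - len(r)) for r in rows]
    cols = [[j for j, _ in r] + [-1] * (width - len(r)) for r in rows]  # -1 marks padding
    return vals, cols


def csr_matmul(csr, b):
    """Multiply a CSR matrix by a dense matrix b, counting scalar multiplications."""
    values, col_idx, row_ptr = csr
    n_rows, n_cols_b = len(row_ptr) - 1, len(b[0])
    out = [[0] * n_cols_b for _ in range(n_rows)]
    mults = 0
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):  # only stored (nonzero) entries
            a, j = values[k], col_idx[k]
            for c in range(n_cols_b):
                out[i][c] += a * b[j][c]
                mults += 1
    return out, mults


# Hypothetical example: a 4x4 matrix with 6 nonzeros (62.5% sparsity) times a 4x2 matrix.
A = [[1, 0, 0, 2],
     [0, 0, 3, 0],
     [0, 4, 0, 0],
     [5, 0, 0, 6]]
B = [[1, 2], [3, 4], [5, 6], [7, 8]]

product, sparse_mults = csr_matmul(to_csr(A), B)
dense_mults = len(A) * len(A[0]) * len(B[0])  # 32 for the naive dense algorithm
print(product, sparse_mults, dense_mults)     # sparse_mults is 12
```

In this sketch the sparse product needs 12 scalar multiplications against 32 for the dense algorithm. The ELLPACK encoding stores a fixed-width block per row, so its storage depends on the densest row, whereas CSR stores exactly one value per nonzero.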
