Evaluation of Privacy-aware Support Vector Machine (SVM) Learning using Homomorphic Encryption
William J Buchanan, Hisham Ali
TL;DR
This work investigates privacy-aware machine learning by applying Fully Homomorphic Encryption (FHE) to Support Vector Machines (SVM) using the CKKS scheme implemented in OpenFHE. The authors train plaintext SVM models and perform encrypted inference on the Iris dataset, systematically varying encryption parameters: the ring dimension $N$, multiplicative depth $D$, scale size $S$, first modulus size $M$, security level $L$, and batch size $B$. They compare linear and polynomial kernels (SVM-Linear vs SVM-Poly) and find that both perform similarly, with classification accuracy around 96.7% under encryption. Encrypted inference incurs substantial overhead (roughly 1,000x slower than plaintext inference), and runtime grows with $N$ and $D$. The results identify ring dimension and modulus size as the primary determinants of FHE performance and point to potential optimizations (packing, bootstrapping, hardware acceleration) that could make privacy-preserving ML more practical in real-world settings. Overall, the paper shows that privacy-preserving SVM via CKKS maintains accuracy while exposing clear trade-offs between security/precision and computational efficiency.
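For reference, the two models differ in the arithmetic that encrypted inference has to perform. The textbook decision functions for a binary classifier are sketched below. The polynomial form follows the scikit-learn kernel parametrization, and the degree $d$, the coefficient $\gamma$, and the offset $c_0$ are left symbolic because their values are not given here.

$$
f_{\text{lin}}(\mathbf{x}) = \mathbf{w}^{\top}\mathbf{x} + b,
\qquad
f_{\text{poly}}(\mathbf{x}) = \sum_{i \in \mathrm{SV}} \alpha_i y_i \left(\gamma\, \mathbf{x}_i^{\top}\mathbf{x} + c_0\right)^{d} + b
$$

Here $\mathbf{x}$ is the encrypted input, while $\mathbf{w}$, $b$, the support vectors $\mathbf{x}_i$, and the dual coefficients $\alpha_i y_i$ come from plaintext training. Raising an encrypted value to the power $d$ requires ciphertext-by-ciphertext multiplications, which is where the multiplicative depth $D$ enters for SVM-Poly.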
Abstract
The need for privacy-aware machine learning grows as Personally Identifiable Information (PII) is increasingly used to train models. To address these privacy issues, Fully Homomorphic Encryption (FHE) can be applied to encrypt data before it is fed into a machine learning model. This involves creating a homomorphic encryption key pair, where the public key encrypts the input data and the private key decrypts the output. However, homomorphic encryption often carries a performance penalty, so this paper evaluates the overhead of running the SVM machine learning technique with the OpenFHE homomorphic encryption library. The implementation uses Python and the scikit-learn library. The experiments vary multiplication depth, scale size, first modulus size, security level, batch size, and ring dimension, and cover two SVM models, SVM-Poly and SVM-Linear. Overall, the results show that the two parameters that most affect performance are the ring dimension and the modulus size, and that SVM-Poly and SVM-Linear perform at similar levels.
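The key-pair workflow described above (encrypt the input with the public key, evaluate the model on ciphertexts, decrypt the result with the private key) can be illustrated with a minimal, standard-library-only sketch. Note the assumptions: this is not the paper's OpenFHE/CKKS code. It swaps in a toy Paillier cryptosystem, which is only additively homomorphic, uses small insecure primes, and relies on hypothetical model weights, bias, and fixed-point scale chosen purely for illustration.

```python
import math
import secrets

# Toy Paillier cryptosystem, used as a stand-in for CKKS (assumption).
# The small Mersenne primes below are for illustration only and are NOT secure.
P, Q = 2**31 - 1, 2**61 - 1
SCALE = 1000  # hypothetical fixed-point scale for encoding real numbers


def keygen(p=P, q=Q):
    """Return (public_key, private_key) for the toy scheme."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of lam mod n (valid for generator g = n + 1)
    return n, (n, lam, mu)


def encrypt(n, m):
    """Encrypt a signed integer m under the public key n."""
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n-1]
    return (1 + (m % n) * n) * pow(r, n, n2) % n2


def decrypt(priv, c):
    """Decrypt a ciphertext to a signed integer."""
    n, lam, mu = priv
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    return m - n if m > n // 2 else m  # map back to the signed range


def encrypted_linear_score(n, enc_x, weights, bias):
    """Server side: compute Enc(w.x + b) from encrypted features and a plaintext model."""
    n2 = n * n
    acc = encrypt(n, round(bias * SCALE * SCALE))  # bias encoded at scale SCALE**2
    for c, w in zip(enc_x, weights):
        term = pow(c, round(w * SCALE) % n, n2)  # Enc(x)**k = Enc(k * x)
        acc = acc * term % n2                    # Enc(a) * Enc(b) = Enc(a + b)
    return acc


x = [5.1, 3.5, 1.4, 0.2]             # one Iris sample (sepal/petal measurements)
weights = [-0.05, -0.52, 1.0, 0.46]  # hypothetical linear SVM weights
bias = -1.45                         # hypothetical bias

pub, priv = keygen()                                 # client creates the key pair
enc_x = [encrypt(pub, round(v * SCALE)) for v in x]  # client encrypts the input
enc_score = encrypted_linear_score(pub, enc_x, weights, bias)  # server sees ciphertexts only
score = decrypt(priv, enc_score) / SCALE**2          # client decrypts the output
plain_score = sum(w * v for w, v in zip(weights, x)) + bias
print(score, plain_score)
```

The sketch covers only the linear decision function, because Paillier supports ciphertext addition and multiplication by plaintext constants but not ciphertext-by-ciphertext multiplication. A polynomial kernel needs the latter, which is what a scheme such as CKKS provides, at the cost governed by parameters like the multiplication depth and ring dimension studied here.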
