A Haskell to FHE Transpiler
Anne Müller, Mohd Kashif, Nico Döttling
TL;DR
The paper addresses a key bottleneck in applying Fully Homomorphic Encryption (FHE): programs must be written as low-level circuits. It enables high-level programming through a Haskell-to-FHE transpiler built with Clash, producing Boolean circuits that TFHE can evaluate. It adds a layer-based parallelization strategy that evaluates the independent gates within each circuit layer concurrently, demonstrating competitive performance on AES-128 and Private Information Retrieval (PIR) benchmarks. The key contributions are the first Haskell (via Clash) to FHE compiler and the demonstration that simple layer-wise parallelism is effective, with reported evaluation times of 28 seconds with 16 threads and 8 seconds with 100 threads, and with publicly available benchmarks. This work broadens FHE accessibility for functional programmers and provides a practical pathway toward scalable encrypted computation in real-world workloads.
Abstract
Fully Homomorphic Encryption (FHE) enables the evaluation of programs directly on encrypted data. However, because only basic operations can be performed on ciphertexts, programs must be expressed as Boolean or arithmetic circuits. This low-level representation makes implementing applications for FHE significantly more cumbersome than writing code in a high-level language. To reduce this burden, several transpilers have been developed that translate high-level code into circuit representations. In this work, we extend the range of high-level languages that can target FHE by introducing a transpiler for Haskell, which converts Haskell programs into Boolean circuits suitable for homomorphic evaluation. Our second contribution is the automatic parallelization of these generated circuits. We implement an evaluator that processes the circuit layer by layer and executes the gates within each layer in parallel. We demonstrate the effectiveness of our approach on two key applications: Private Information Retrieval (PIR) and the AES encryption standard. Prior work has parallelized AES encryption manually. We demonstrate that the automated method outperforms some but not all manual parallelizations of AES evaluations under FHE. We achieve an evaluation time of 28 seconds for a parallel execution with 16 threads and an evaluation time of 8 seconds for a parallel execution with 100 threads.
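The layer-wise evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluator: the gate format, the names `GATE_OPS`, `MUX_GATES`, `layers_of`, and `evaluate` are all hypothetical, and the gates operate on plaintext bits as stand-ins for homomorphic gates on ciphertexts. The example circuit is a two-entry selection, the basic building block of a PIR lookup.

```python
from concurrent.futures import ThreadPoolExecutor

# Plaintext stand-ins for homomorphic gates (assumption: a real evaluator
# would invoke an FHE library's gate operations on ciphertexts instead).
GATE_OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a: 1 - a,
}

# Hypothetical circuit format: (output wire, gate type, input wires),
# listed in topological order. Computes out = d1 if s else d0.
MUX_GATES = [
    ("ns", "NOT", ["s"]),
    ("t0", "AND", ["ns", "d0"]),
    ("t1", "AND", ["s", "d1"]),
    ("out", "OR", ["t0", "t1"]),
]

def layers_of(gates, inputs):
    """Group gates by depth: a gate sits one layer above its deepest operand."""
    depth = {name: 0 for name in inputs}
    layers = {}
    for out, op, args in gates:
        depth[out] = 1 + max(depth[a] for a in args)
        layers.setdefault(depth[out], []).append((out, op, args))
    return [layers[d] for d in sorted(layers)]

def evaluate(gates, inputs, threads=4):
    """Evaluate the circuit one layer at a time, running each layer's gates in parallel."""
    wires = dict(inputs)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for layer in layers_of(gates, inputs):
            # Gates in a layer only read wires from earlier layers,
            # so they are independent and can run concurrently.
            results = list(
                pool.map(lambda g: GATE_OPS[g[1]](*(wires[a] for a in g[2])), layer)
            )
            # Write results only after the whole layer has finished.
            for (out, _, _), value in zip(layer, results):
                wires[out] = value
    return wires
```

For instance, `evaluate(MUX_GATES, {"s": 1, "d0": 0, "d1": 1})["out"]` selects `d1`. With real FHE gates, each gate evaluation is expensive, so the achievable speedup is bounded by the number of gates per layer and by the number of layers, which must still run sequentially.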
