Accelerating Post-Quantum Cryptography via LLM-Driven Hardware-Software Co-Design
Yuchao Liao, Tosiron Adegbija, Roman Lysecky
TL;DR
The paper tackles the challenge of accelerating post-quantum cryptography (PQC) implementations, focusing on the FALCON digital signature scheme on FPGAs. It proposes an LLM-driven hardware-software co-design framework that analyzes PQC algorithms to identify bottlenecks, partitions workloads between hardware and software, and generates HDL for FPGA synthesis, integrating with standard toolchains. Empirical results show that LLM-generated accelerators can achieve up to $2.6\times$ kernel speedups over conventional HLS, with average speedups of $1.78\times$ over the software baseline and $1.15\times$ over HLS, along with higher LUT/FF usage but reduced dynamic power. The work demonstrates that LLM-informed partitioning and HDL generation can reduce design effort and development time for PQC accelerators on FPGAs, while noting the need for formal constant-time verification and systematic end-to-end evaluation as future work.
Abstract
Post-quantum cryptography (PQC) is crucial for securing data against emerging quantum threats. However, its algorithms are computationally complex and difficult to implement efficiently on hardware. In this paper, we explore the potential of Large Language Models (LLMs) to accelerate the hardware-software co-design process for PQC, with a focus on the FALCON digital signature scheme. We present a novel framework that leverages LLMs to analyze PQC algorithms, identify performance-critical components, and generate candidate hardware descriptions for FPGA implementation. We present the first quantitative comparison between LLM-driven synthesis and conventional HLS-based approaches for low-level compute-intensive kernels in FALCON, showing that human-in-the-loop LLM-generated accelerators can achieve up to 2.6x speedup in kernel execution time with shorter critical paths, while highlighting trade-offs in resource utilization and power consumption. Our results suggest that LLMs can minimize design effort and development time by automating FPGA accelerator design iterations for PQC algorithms, offering a promising new direction for rapid and adaptive PQC accelerator design on FPGAs.
