Accelerating Post-Quantum Cryptography via LLM-Driven Hardware-Software Co-Design

Yuchao Liao; Tosiron Adegbija; Roman Lysecky


TL;DR

The paper tackles the challenge of accelerating post-quantum cryptography (PQC) implementations, focusing on the FALCON digital signature scheme on FPGAs. It proposes an LLM-driven hardware-software co-design framework that analyzes PQC algorithms to identify bottlenecks, partitions workloads between hardware and software, and generates HDL for FPGA synthesis, integrating with standard toolchains. In a comparison against conventional HLS, human-in-the-loop LLM-generated accelerators achieve up to $2.6\times$ speedup in kernel execution time with shorter critical paths, at the cost of trade-offs in resource utilization (LUT/FF usage) and power consumption. The work demonstrates that LLM-informed partitioning and HDL generation can reduce design effort and development time for PQC accelerators on FPGAs, while noting the need for formal constant-time verification and systematic end-to-end evaluation as future work.

Abstract

Post-quantum cryptography (PQC) is crucial for securing data against emerging quantum threats. However, its algorithms are computationally complex and difficult to implement efficiently on hardware. In this paper, we explore the potential of Large Language Models (LLMs) to accelerate the hardware-software co-design process for PQC, with a focus on the FALCON digital signature scheme. We present a novel framework that leverages LLMs to analyze PQC algorithms, identify performance-critical components, and generate candidate hardware descriptions for FPGA implementation. We present the first quantitative comparison between LLM-driven synthesis and conventional HLS-based approaches for low-level compute-intensive kernels in FALCON, showing that human-in-the-loop LLM-generated accelerators can achieve up to 2.6x speedup in kernel execution time with shorter critical paths, while highlighting trade-offs in resource utilization and power consumption. Our results suggest that LLMs can minimize design effort and development time by automating FPGA accelerator design iterations for PQC algorithms, offering a promising new direction for rapid and adaptive PQC accelerator design on FPGAs.


Paper Structure (14 sections, 3 figures, 3 tables)


Introduction
Background
FALCON and hardware acceleration
High-Level synthesis and LLM-assisted hardware design
Methodology
Hardware-software partitioning
LLM-driven hardware generation
Experiments
Experimental setup
Results and analysis
Hardware/software partitioning
Accelerator generation and comparison to prior work
Security and Constant-Time Considerations
Conclusion

Figures (3)

Figure 1: Overview of the proposed LLM-driven PQC acceleration flow, integrating profiling, partitioning, and Verilog generation within FPGA design tools for rapid HW–SW co-design.
Figure 2: Using LLM to rank functions in the PQC's key generation path.
Figure 3: Using LLM to accelerate modp_montymul function with fully pipelined design.
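
For readers unfamiliar with the kernel named in Figure 3, the following is a minimal software sketch of Montgomery modular multiplication, the operation that a function such as modp_montymul computes. It is an illustration of the general technique, not the paper's generated Verilog or the exact FALCON reference code. The choice of R = 2^31, the helper names, and the example modulus are assumptions made here for illustration.

```python
# Sketch of Montgomery multiplication with R = 2^31 (assumed word size).
# Computes a * b * R^{-1} mod p for an odd modulus p < 2^31.

R_BITS = 31
R = 1 << R_BITS
R_MASK = R - 1


def neg_inv_mod_r(p: int) -> int:
    """Return -p^{-1} mod R; p must be odd so the inverse exists."""
    return (-pow(p, -1, R)) & R_MASK


def montymul(a: int, b: int, p: int, p0i: int) -> int:
    """Montgomery product of a and b (both in [0, p)) modulo p."""
    z = a * b                       # full-width product
    w = ((z * p0i) & R_MASK) * p    # multiple of p that clears the low 31 bits of z
    d = (z + w) >> R_BITS           # exact division by R; result is in [0, 2p)
    # One conditional subtraction brings the result into [0, p).
    # A hardware or constant-time version would use a mask/mux instead of a branch.
    if d >= p:
        d -= p
    return d


def to_monty(a: int, p: int) -> int:
    """Convert a to Montgomery form (a * R mod p)."""
    return (a * R) % p


def from_monty(x: int, p: int, p0i: int) -> int:
    """Convert x out of Montgomery form by multiplying with 1."""
    return montymul(x, 1, p, p0i)


# Illustrative modulus (hypothetical choice): the Mersenne prime 2^31 - 1.
P = (1 << 31) - 1
P0I = neg_inv_mod_r(P)


def mulmod_via_monty(a: int, b: int) -> int:
    """Plain a * b mod P, routed through Montgomery form."""
    return from_monty(montymul(to_monty(a, P), to_monty(b, P), P, P0I), P, P0I)
```

The multiply, mask, multiply, add, shift, and conditional-subtract steps have no loop-carried dependency, which is why a kernel of this shape lends itself to the fully pipelined design the caption describes.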
