LAAFD: LLM-based Agents for Accelerated FPGA Design
Maxim Moraru, Kamalavasan Kamalakkannan, Jered Dominguez-Trujillo, Patrick Diehl, Atanu Barai, Julien Loiseau, Zachary Kent Baker, Howard Pritchard, Galen M Shipman
TL;DR
This work tackles the barrier to FPGA adoption by presenting LAAFD, an agentic workflow that automatically translates general-purpose C++ into latency-optimized Vitis HLS kernels guided by co-simulation and synthesis feedback. The approach comprises translation, validation, and an iterative judge–optimizer loop, enabling aggressive HLS optimizations (pipelining, vectorization, dataflow) across 15 HPC kernels, including stencil workloads. Empirically, LAAFD achieves 99.9% of the geometric-mean performance of hand-tuned implementations and matches SODA for stencil code, while often increasing code readability and reducing domain expertise requirements. The results suggest substantial potential to broaden FPGA acceleration in science and edge computing, with a practical translation cost (~US$50) and clear avenues for extending to full applications and model enhancements.
Abstract
FPGAs offer high performance, low latency, and energy efficiency for accelerated computing, yet adoption in scientific and edge settings is limited by the specialized hardware expertise required. High-level synthesis (HLS) boosts productivity over HDLs, but competitive designs still demand hardware-aware optimizations and careful dataflow design. We introduce LAAFD, an agentic workflow that uses large language models to translate general-purpose C++ into optimized Vitis HLS kernels. LAAFD automates key transformations (deep pipelining, vectorization, and dataflow partitioning) and closes the loop with HLS co-simulation and synthesis feedback, verifying correctness while iteratively reducing execution time measured in cycles. Over a suite of 15 kernels representing common compute patterns in HPC, LAAFD achieves 99.9% geomean performance compared to the hand-tuned Vitis HLS baseline. For stencil workloads, LAAFD matches the performance of SODA, a state-of-the-art DSL-based HLS code generator for stencil solvers, while yielding more readable kernels. These results suggest LAAFD substantially lowers the expertise barrier to FPGA acceleration without sacrificing efficiency.
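To make the kind of transformation named above concrete, the following is a minimal sketch of a C++ loop annotated with two standard Vitis HLS directives, `PIPELINE` and `UNROLL`. It is not output from LAAFD and is not one of the 15 kernels evaluated; the kernel name `saxpy_kernel`, the problem size `N`, and the unroll factor are all illustrative assumptions. A standard C++ compiler ignores the unknown pragmas, so the same source can be checked functionally on a CPU before synthesis.

```cpp
#include <cstddef>

// Hypothetical fixed problem size, chosen only for illustration.
constexpr std::size_t N = 1024;

// Hypothetical kernel (not from the LAAFD benchmark suite):
// computes out[i] = a * x[i] + y[i].
void saxpy_kernel(float a, const float x[N], const float y[N], float out[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        // Ask the HLS tool to start a new loop iteration every cycle.
#pragma HLS PIPELINE II=1
        // Ask the HLS tool to replicate the loop body four times
        // (the factor 4 is an assumption, not a tuned value).
#pragma HLS UNROLL factor=4
        out[i] = a * x[i] + y[i];
    }
}
```

Whether the requested initiation interval and unroll factor are actually achieved depends on memory port availability and the target device, which is the kind of information a synthesis report provides and that a feedback-driven workflow can act on.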
