LAAFD: LLM-based Agents for Accelerated FPGA Design

Maxim Moraru, Kamalavasan Kamalakkannan, Jered Dominguez-Trujillo, Patrick Diehl, Atanu Barai, Julien Loiseau, Zachary Kent Baker, Howard Pritchard, Galen M Shipman

TL;DR

This work tackles the barrier to FPGA adoption by presenting LAAFD, an agentic workflow that automatically translates general-purpose C++ into latency-optimized Vitis HLS kernels guided by co-simulation and synthesis feedback. The approach comprises translation, validation, and an iterative judge–optimizer loop, enabling aggressive HLS optimizations (pipelining, vectorization, dataflow) across 15 HPC kernels, including stencil workloads. Empirically, LAAFD achieves $99.9\%$ of the geometric-mean performance of hand-tuned implementations and matches SODA for stencil code, while often increasing code readability and reducing domain expertise requirements. The results suggest substantial potential to broaden FPGA acceleration in science and edge computing, with a practical translation cost (~US$50) and clear avenues for extending to full applications and model enhancements.
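To make the geometric-mean figure concrete, here is a minimal sketch of how such an aggregate could be computed from per-kernel cycle counts. The function names and the choice of ratio (baseline cycles divided by generated-kernel cycles) are assumptions made for illustration. The text above does not spell out the exact formula, and no measured values from the paper are used.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: performance of a generated kernel relative to a
// hand-tuned baseline. Assumption: fewer cycles means higher performance,
// so a value of 1.0 is parity and a value below 1.0 means the generated
// kernel is slower than the baseline.
double relative_performance(double baseline_cycles, double generated_cycles) {
    assert(baseline_cycles > 0.0 && generated_cycles > 0.0);
    return baseline_cycles / generated_cycles;
}

// Geometric mean of strictly positive ratios, accumulated in log space
// to avoid overflow or underflow when many ratios are multiplied.
double geomean(const std::vector<double>& ratios) {
    assert(!ratios.empty());
    double log_sum = 0.0;
    for (double r : ratios) {
        assert(r > 0.0);
        log_sum += std::log(r);
    }
    return std::exp(log_sum / static_cast<double>(ratios.size()));
}
```

A geometric mean is the usual choice for averaging ratios because a 2x slowdown on one kernel and a 2x speedup on another cancel to exactly 1.0, which an arithmetic mean would not report.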

Abstract

FPGAs offer high performance, low latency, and energy efficiency for accelerated computing, yet adoption in scientific and edge settings is limited by the specialized hardware expertise required. High-level synthesis (HLS) boosts productivity over HDLs, but competitive designs still demand hardware-aware optimizations and careful dataflow design. We introduce LAAFD, an agentic workflow that uses large language models to translate general-purpose C++ into optimized Vitis HLS kernels. LAAFD automates key transformations (deep pipelining, vectorization, and dataflow partitioning) and closes the loop with HLS co-simulation and synthesis feedback to verify correctness while iteratively reducing execution time in cycles. Over a suite of 15 kernels representing common compute patterns in HPC, LAAFD achieves 99.9% geomean performance when compared to the hand-tuned baseline for Vitis HLS. For stencil workloads, LAAFD matches the performance of SODA, a state-of-the-art DSL-based HLS code generator for stencil solvers, while yielding more readable kernels. These results suggest LAAFD substantially lowers the expertise barrier to FPGA acceleration without sacrificing efficiency.
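As a rough illustration of the kind of hardware-aware transformation the abstract refers to, the sketch below shows a small 3-point stencil written in the style of a pipelined Vitis HLS kernel. The kernel, its name `stencil1d`, and the size `N` are hypothetical and are not taken from the paper or from LAAFD's output. Only the `PIPELINE` and `ARRAY_PARTITION` pragmas are standard Vitis HLS directives, and an ordinary C++ compiler ignores them, so the code also runs on a CPU for functional checks. Dataflow partitioning is not shown.

```cpp
#include <cstddef>

// Hypothetical problem size, chosen only for this sketch.
constexpr std::size_t N = 1024;
static_assert(N >= 3, "a 3-point stencil needs at least three elements");

// Hypothetical 3-point averaging stencil. Interior points receive the mean
// of their neighbourhood; the two boundary points are copied through.
void stencil1d(const float in[N], float out[N]) {
    // Sliding window over the input. Full partitioning asks the tool to keep
    // the three elements in registers so all can be read in the same cycle.
    float window[3] = {0.0f, 0.0f, 0.0f};
#pragma HLS ARRAY_PARTITION variable=window complete

    for (std::size_t i = 0; i < N; ++i) {
        // Request one loop iteration per clock cycle. Each iteration performs
        // a single read of `in`, which avoids memory-port conflicts.
#pragma HLS PIPELINE II=1
        window[0] = window[1];
        window[1] = window[2];
        window[2] = in[i];
        if (i >= 2) {
            out[i - 1] = (window[0] + window[1] + window[2]) / 3.0f;
        }
    }

    // Boundary handling: copy the end points unchanged.
    out[0] = in[0];
    out[N - 1] = in[N - 1];
}
```

The shift-register window is the design choice that matters here: a naive version reading `in[i-1]`, `in[i]`, and `in[i+1]` in each iteration would need three memory reads per cycle, which typically prevents the tool from reaching an initiation interval of 1.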

Paper Structure (19 sections, 9 figures, 6 tables)

Figures (9)

  • Figure 1: Dataflow in compute loop of stencil 2D (S2D) iteration
  • Figure 2: Simplified vision of the workflow for translating, compiling, validating, and optimizing pure C++ code into C++ HLS for Vitis.
  • Figure 3: Initial C++ kernel provided as input to the workflow.
  • Figure 4: Initial code translation produced by the translator agent.
  • Figure 5: Corrected kernel produced by the compile fixer agent.
  • ...and 4 more figures