LAAFD: LLM-based Agents for Accelerated FPGA Design

Maxim Moraru; Kamalavasan Kamalakkannan; Jered Dominguez-Trujillo; Patrick Diehl; Atanu Barai; Julien Loiseau; Zachary Kent Baker; Howard Pritchard; Galen M Shipman

TL;DR

This work addresses a key barrier to FPGA adoption by presenting LAAFD, an agentic workflow that automatically translates general-purpose C++ into latency-optimized Vitis HLS kernels, guided by co-simulation and synthesis feedback. The workflow comprises translation, validation, and an iterative judge–optimizer loop, and it applies aggressive HLS optimizations (pipelining, vectorization, dataflow) across 15 HPC kernels, including stencil workloads. Empirically, LAAFD achieves 99.9% of the geometric-mean performance of hand-tuned implementations and matches SODA on stencil code, while often improving code readability and reducing the domain expertise required. The results suggest substantial potential to broaden FPGA acceleration in science and edge computing, with a practical translation cost (~US$50) and clear avenues for extending the approach to full applications and improved models.

Abstract

FPGAs offer high performance, low latency, and energy efficiency for accelerated computing, yet adoption in scientific and edge settings is limited by the specialized hardware expertise required. High-level synthesis (HLS) boosts productivity over HDLs, but competitive designs still demand hardware-aware optimizations and careful dataflow design. We introduce LAAFD, an agentic workflow that uses large language models to translate general-purpose C++ into optimized Vitis HLS kernels. LAAFD automates key transformations (deep pipelining, vectorization, and dataflow partitioning) and closes the loop with HLS co-simulation and synthesis feedback to verify correctness while iteratively reducing execution time in cycles. Over a suite of 15 kernels representing common compute patterns in HPC, LAAFD achieves 99.9% of the geomean performance of the hand-tuned Vitis HLS baseline. For stencil workloads, LAAFD matches the performance of SODA, a state-of-the-art DSL-based HLS code generator for stencil solvers, while yielding more readable kernels. These results suggest LAAFD substantially lowers the expertise barrier to FPGA acceleration without sacrificing efficiency.
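To make the pipelining and array-partitioning transformations named in the abstract more concrete, the following is a minimal sketch of what an HLS-annotated stencil kernel might look like. It is not code from the paper and not output of LAAFD: the kernel `stencil1d`, the size `N`, the 3-point averaging stencil, and the pragma placement are all assumptions chosen for illustration. The `#pragma HLS` directives are standard Vitis HLS pragmas; a regular C++ compiler ignores them (possibly with an unknown-pragma warning), so the function also runs as plain C++.

```cpp
#include <cstddef>

// Hypothetical problem size, chosen only for this illustration.
constexpr std::size_t N = 1024;

// Hypothetical 3-point averaging stencil (not taken from the paper).
// A small shift-register window holds the last three inputs so that each
// loop iteration reads exactly one new element from `in`.
void stencil1d(const float in[N], float out[N]) {
    float window[3] = {0.0f, 0.0f, 0.0f};
    // Ask the tool to map the window to individual registers so all three
    // elements can be read in the same cycle.
#pragma HLS ARRAY_PARTITION variable=window complete dim=1

    for (std::size_t i = 0; i < N; ++i) {
        // Request a pipelined loop that starts one iteration per cycle.
#pragma HLS PIPELINE II=1
        // Shift the window and load the newest input.
        window[0] = window[1];
        window[1] = window[2];
        window[2] = in[i];

        // Once three inputs are available, emit the centre point.
        if (i >= 2) {
            out[i - 1] = (window[0] + window[1] + window[2]) / 3.0f;
        }
    }

    // Boundary points are copied through unchanged (an assumption).
    out[0] = in[0];
    out[N - 1] = in[N - 1];
}
```

The shift-register window is one common way to let a pipelined loop sustain an initiation interval of 1, since it avoids reading the same input array three times per iteration. Whether a given tool version actually achieves that interval depends on the target device and interface settings, which this sketch does not specify.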

Paper Structure (19 sections, 9 figures, 6 tables)

This paper contains 19 sections, 9 figures, 6 tables.

Introduction
Background
High level synthesis (HLS)
High Level RTL Generation Frameworks
Related Work
Methodology
Agentic Workflow
Translation
Validation
Optimization
Illustrative Example
Kernels and HLS Optimizations
Evaluation
Execution cycles
Consumed FPGA resources
...and 4 more sections

Figures (9)

Figure 1: Dataflow in compute loop of stencil 2D (S2D) iteration
Figure 2: Simplified vision of the workflow for translating, compiling, validating, and optimizing pure C++ code into C++ HLS for Vitis.
Figure 3: Initial C++ kernel provided as input to the workflow.
Figure 4: Initial code translation produced by the translator agent.
Figure 5: Corrected kernel produced by the compile fixer agent.
...and 4 more figures
