An MLIR pipeline for offloading Fortran to FPGAs via OpenMP
Gabriel Rodriguez-Canal, David Katz, Nick Brown
TL;DR
Motivated by the HPC community's push for energy-efficient acceleration, this paper presents what the authors believe to be the first MLIR-based pipeline that offloads Fortran OpenMP regions to FPGAs via the OpenMP target directive. The pipeline combines the MLIR OpenMP dialect with an HLS dialect: Flang is the front end, and the generated LLVM-IR is handled by AMD's HLS backend to produce FPGA bitstreams. Because the flow is built on MLIR, it is portable across MLIR front ends and reuses existing components, which reduces bespoke compiler effort while preserving OpenMP data semantics and still allowing manual kernel optimisation. The evaluation uses two kernels: SAXPY, which computes $y = a \times x + y$, and SGESL, which solves $A x = b$. On both, runtime is on par with hand-written HLS and resource usage is comparable, while FPGA power draw is about half that of a CPU core, so the productivity gains do not come at the cost of performance. Overall, the work demonstrates that MLIR is a practical, directive-based pathway to FPGA acceleration in HPC.
Abstract
With the slowing of Moore's Law, heterogeneous computing platforms such as Field Programmable Gate Arrays (FPGAs) have gained increasing interest for accelerating HPC workloads. In this work we present, to the best of our knowledge, the first implementation of selective code offloading to FPGAs via the OpenMP target directive within MLIR. Our approach combines the MLIR OpenMP dialect with a High-Level Synthesis (HLS) dialect to provide a portable compilation flow targeting FPGAs. Unlike prior OpenMP FPGA efforts that rely on custom compilers, we integrate with MLIR and so support any MLIR-compatible front end, demonstrated here with Flang. Reusing a range of existing MLIR components significantly reduces the effort required and demonstrates the composability benefits of the MLIR ecosystem. Our approach supports manual optimisation of offloaded kernels through standard OpenMP directives. This work establishes a flexible and extensible path for directive-based FPGA acceleration integrated within the MLIR ecosystem.
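
To make "selective offloading via the OpenMP target directive" concrete, the following is a minimal sketch of a SAXPY kernel marked for offload. It is written in C purely for illustration; the paper's flow takes Fortran through Flang, and this is not the authors' benchmark code. The function name `saxpy_offload` and the particular choice of `map` clauses are assumptions made for this example. Compiled without OpenMP offload support, the pragmas are ignored and the loop runs on the host.

```c
#include <assert.h>

/* Hypothetical illustration (not from the paper): SAXPY, y = a * x + y,
 * over n single-precision elements. */
void saxpy_offload(int n, float a, const float *x, float *y)
{
    assert(n >= 0);

    /* The target directive marks the region to be offloaded.
     * map(to: ...) copies x to the device; map(tofrom: ...) copies y
     * to the device and back to the host when the region ends. */
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling `saxpy_offload(4, 2.0f, x, y)` with `x = {1, 2, 3, 4}` and `y = {10, 20, 30, 40}` leaves `y` holding `{12, 24, 36, 48}`. The `map` clauses are where the OpenMP data semantics mentioned above become visible to the programmer: only `y` is transferred back to the host.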
