Preemption-Enhanced Benchmark Suite for FPGAs

Arsalan Ali Malik, John Buchanan, Aydin Aysu

TL;DR

The paper tackles the lack of standardized, open benchmarks for evaluating FPGA preemption in multi-tenant cloud environments. It introduces the first open-source preemption-enabled benchmark suite, comprising two families of accelerators (PL-based and RISC-V processor-based) and a total of 27 real-world benchmarks with integrated state save/restore mechanisms. The suite enables rigorous, reproducible evaluation of preemption overheads, scheduling policies, and context-switch dynamics across diverse workloads, and is validated on a Xilinx Zynq-7000 platform with PCAP-based control. By providing deployment and extension guidelines, the work lays the groundwork for a unified research infrastructure that supports fair comparisons and advances in preemption-aware runtime systems for FPGA ecosystems.

Abstract

Field-Programmable Gate Arrays (FPGAs) have become essential in cloud computing due to their reconfigurability, energy efficiency, and ability to accelerate domain-specific workloads. As FPGA adoption grows, research into task scheduling and preemption techniques has intensified. However, the field lacks a standardized benchmarking framework for consistent and reproducible evaluation. Many existing studies propose innovative scheduling or preemption mechanisms but often rely on proprietary or synthetic benchmarks, limiting generalizability and making comparison difficult. This methodological fragmentation hinders effective evaluation of scheduling strategies and preemption in multi-tenant FPGA environments. This paper presents the first open-source preemption-enabled benchmark suite for evaluating FPGA preemption strategies and testing new scheduling algorithms, without requiring users to create preemption workloads from scratch. The suite includes 27 diverse applications spanning cryptography, AI/ML, computation-intensive workloads, communication systems, and multimedia processing. Each benchmark integrates comprehensive context-saving and restoration mechanisms, facilitating reproducible research and consistent comparisons. Our suite not only simplifies testing FPGA scheduling policies but also benefits OS research by enabling the evaluation of scheduling fairness, resource allocation efficiency, and context-switching performance in multi-tenant FPGA systems, ultimately supporting the development of better operating systems and scheduling policies for FPGA-based environments. We also provide guidelines for adding new benchmarks, enabling future research to expand and refine FPGA preemption and scheduling evaluation.

Paper Structure

This paper contains 43 sections, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Temporal distribution of academic publications referencing 'FPGA preemption' and 'FPGA context-switching scheduling' over the past decade, as indexed by Google Scholar [StateMover, malik2025epoch, StopnLook, StateMover2, Coyote, ATC_compiler_1, ATC_hwctxsw_2, AmorphOS, OPTIMUS, STFS, Nimblock]. The data show a clear increase in research on managing FPGA resources dynamically through preemption, reflecting growing industrial and academic interest.
  • Figure 2: Latency of the 15 PL-based hardware accelerators with preemption support (which we are releasing as open-source). The X-axis shows the benchmark names, while the Y-axis uses a logarithmic scale to present their corresponding latency. The average latency values, in clock cycles, are placed above each bar to aid interpretation.
  • Figure 3: FPGA floorplan visualization of two deployment configurations: ⓐ a two-slot design featuring PL-based hardware accelerators placed over reconfigurable regions spanning clock regions X0Y1 and X1Y1; and ⓑ a single-slot design hosting a RISC-V–based hardware accelerator distributed across clock regions X0Y0 and X1Y0.
  • Figure 4: Latency of the 12 RISC-V processor-based hardware accelerators with preemption support (which we are making open-source). The X-axis shows the benchmark names, while the Y-axis uses a logarithmic scale to present their corresponding latency. The average latency values, in clock cycles, are placed above each bar to aid interpretation.