Cement2: Temporal Hardware Transactions for High-Level and Efficient FPGA Programming
Youwei Xiao, Zizhang Luo, Weijie Peng, Yuyang Zou, Yun Liang
TL;DR
Cement2 tackles the challenge of raising hardware design abstraction without sacrificing cycle accuracy. It introduces temporal hardware transactions, a timing-aware extension to transactional HDLs. The framework combines a Rust frontend (CMT2-rs) with CTIR and provides inter-cycle analysis, multi-cycle rules, and a multi-phase synthesis flow that generates efficient RTL for FPGAs. The abstraction covers both intra-cycle and multi-cycle behavior, enabling precise timing coordination and retiming. Across a RISC-V soft-core processor, custom CPU instructions, linear algebra kernels, and systolic array accelerators, Cement2 matches hand-coded RTL in performance and resource usage. The results suggest broad applicability to general FPGA programming, with productivity gains and fine-grained control over timing and resources.
Abstract
Hardware design faces a fundamental challenge: raising abstraction to improve productivity while maintaining control over low-level details such as cycle accuracy. Traditional RTL design in languages like SystemVerilog composes modules through wiring-style connections that provide weak guarantees of behavioral correctness. High-level synthesis (HLS) and emerging abstractions attempt to address this, but they either introduce unpredictable overhead or restrict design generality. Transactional HDLs provide a promising foundation by lifting design abstraction to atomic, composable rules. However, they model only intra-cycle behavior and do not capture the inherently temporal nature of hardware design, which limits their applicability and productivity in FPGA programming scenarios. We propose temporal hardware transactions, a new abstraction that brings cycle-level timing awareness to designers at the transactional language level. Our approach models temporal relationships between rules and supports rules whose actions span multiple clock cycles, providing an intuitive way to describe multi-cycle architectural behavior. We implement this abstraction in Cement2, a transactional HDL embedded in Rust, in which designers program hardware constructors that build both intra-cycle and temporal transactions. Cement2's synthesis framework lowers these descriptions through multiple analysis and optimization phases to generate efficient hardware. Using Cement2, we program a RISC-V soft-core processor, custom CPU instructions, linear algebra kernels, and systolic array accelerators, leveraging the high-level abstraction to improve productivity. Our evaluation shows that Cement2 sacrifices neither performance nor resource efficiency compared to hand-coded RTL designs, demonstrating its applicability to general FPGA design tasks.
