R-HLS: An IR for Dynamic High-Level Synthesis and Memory Disambiguation based on Regions and State Edges

David Metz, Nico Reissmann, Magnus Själander

TL;DR

R-HLS is an RVSDG-based dialect for dynamic high-level synthesis (HLS) that explicitly models control flow, routing, and memory in a global data-flow representation. It uses memory state edges for distributed memory disambiguation with per-operation address management, which exposes finer-grained parallelism and allows memory accesses to execute out of program order. Compared with the prior state of the art in dynamic HLS, the approach yields a 10% average speedup while using 79% fewer lookup tables (LUTs) and 22% fewer flip-flops on average, at the cost of more complex buffering strategies. These findings indicate that region-based data-flow IRs are practical for dynamic HLS and offer a path toward scalable, memory-aware synthesis for irregular workloads.

Abstract

Dynamically scheduled hardware enables high-level synthesis (HLS) for applications with irregular control flow and latencies, which perform poorly with conventional statically scheduled approaches. Since dynamically scheduled hardware is inherently data-flow based, it is beneficial to have an intermediate representation (IR) that captures the global data flow to enable easier transformations. State-of-the-art dynamic HLS tools use control-flow-based IRs, which model data flow only at the basic-block level and therefore require inter-block parallelism to be rediscovered. The Regionalized Value State Dependence Graph (RVSDG) is an IR that models (1) control flow as part of the global data flow using regions and (2) memory dependencies using state edges. We propose R-HLS, a new RVSDG dialect targeted at dynamic high-level synthesis. R-HLS explicitly models control flow decisions, routing, and memory, which are only abstractly represented in the RVSDG. Expressing the control flow as part of the data flow reduces the need for complex optimizations to extract performance and enables easy conversion to parallel circuits. Furthermore, we present a distributed memory disambiguation optimization that leverages memory state edges to decouple address generation from data accesses, resulting in resource-efficient out-of-program-order execution of memory operations. Our results show that R-HLS effectively exposes parallelism, resulting in fewer executed cycles and a 10% speedup on average compared to the state of the art in dynamic HLS with optimized memory disambiguation. These results are achieved with a significant reduction in resource utilization: on average, a 79% reduction in lookup tables and a 22% reduction in flip-flops.
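The role of state edges in exposing memory parallelism can be illustrated with a toy model. The sketch below is not the R-HLS implementation or the RVSDG data structure; the names `MemOp`, `asap_levels`, `single_chain`, and `per_array_chains` are hypothetical, each operation is assumed to take one step, and arrays A and B are assumed not to alias. It compares one state edge that threads all memory operations in program order with one state edge per array.

```python
from dataclasses import dataclass, field


@dataclass
class MemOp:
    """Toy memory operation; `state_deps` names the ops whose state edge it consumes."""
    name: str
    state_deps: list = field(default_factory=list)


def asap_levels(ops):
    """Earliest step per op, assuming unit latency and `ops` given in topological order."""
    level = {}
    for op in ops:
        level[op.name] = 1 + max((level[d] for d in op.state_deps), default=0)
    return level


def single_chain(names):
    """One state edge threads every operation in program order."""
    return [MemOp(n, [names[i - 1]] if i else []) for i, n in enumerate(names)]


def per_array_chains(names):
    """Hypothetical split: one state edge per array (second field of the op name)."""
    last, ops = {}, []
    for n in names:
        array = n.split("_")[1]
        ops.append(MemOp(n, [last[array]] if array in last else []))
        last[array] = n
    return ops


# Program order: accesses to array A and array B are interleaved.
program = ["ld_A_0", "st_B_0", "ld_A_1", "st_B_1"]

depth_single = max(asap_levels(single_chain(program)).values())
depth_split = max(asap_levels(per_array_chains(program)).values())
print(depth_single, depth_split)
```

With a single state chain every access waits for the previous one, so the schedule depth equals the number of operations. With one chain per array, the accesses to A and B only wait on earlier accesses to the same array, so the two chains can proceed side by side.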

Paper Structure (17 sections, 6 figures, 1 table, 1 algorithm)

Figures (6)

  • Figure 1: Lowering gamma node without state edges or theta
  • Figure 2: Lowering of gamma node with state edges
  • Figure 3: Lowering of theta node
  • Figure 4: Memory operation conversion
  • Figure 5: Memory ordering mechanisms
  • ...and 1 more figure