SkyEgg: Joint Implementation Selection and Scheduling for Hardware Synthesis using E-graphs

Youwei Xiao, Yuyang Zou, Yun Liang

TL;DR

SkyEgg tackles a fundamental bottleneck in hardware synthesis: implementation selection and scheduling are optimized separately. By representing both algebraic transformations and hardware implementation choices as rewrite rules inside an e-graph, it saturates the design space and then selects and schedules implementations jointly, using either an exact MILP formulation or an ASAP heuristic. On benchmarks targeting Xilinx Kintex UltraScale+ FPGAs, the approach achieves an average speedup of 3.01x over Vitis HLS (up to 5.22x for complex expressions) and maps complex expressions onto heterogeneous FPGA blocks such as DSP slices while meeting timing constraints. This joint, timing-aware synthesis enables more aggressive exploitation of FPGA resources and scalable hardware generation across a wide range of benchmarks. Overall, SkyEgg demonstrates that unified, e-graph-based design space exploration, coupled with MILP/ASAP solving, can significantly improve hardware performance without prohibitive synthesis costs.

Abstract

Hardware synthesis from high-level descriptions remains fundamentally limited by the sequential optimization of interdependent design decisions. Current methodologies, including state-of-the-art high-level synthesis (HLS) tools, artificially separate implementation selection from scheduling, leading to suboptimal designs that cannot fully exploit modern FPGA heterogeneous architectures. Implementation selection is typically performed by ad-hoc pattern matching on operations, a process that does not consider the impact on scheduling. Subsequently, scheduling algorithms operate on fixed selection solutions with inaccurate delay estimates, which misses critical optimization opportunities from appropriately configured FPGA blocks like DSP slices. We present SkyEgg, a novel hardware synthesis framework that jointly optimizes implementation selection and scheduling using the e-graph data structure. Our key insight is that both algebraic transformations and hardware implementation choices can be uniformly represented as rewrite rules within an e-graph, modeling the complete design space of implementation candidates to be selected and scheduled together. First, SkyEgg constructs an e-graph from the input program. It then applies both algebraic and implementation rewrites through equality saturation. Finally, it formulates the joint optimization as a mixed-integer linear programming (MILP) problem on the saturated e-graph. We provide both exact MILP solving and an efficient ASAP heuristic for scalable synthesis. Our evaluation on benchmarks from diverse applications targeting Xilinx Kintex UltraScale+ FPGAs demonstrates that SkyEgg achieves an average speedup of 3.01x over Vitis HLS, with improvements up to 5.22x for complex expressions.
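The three steps named in the abstract (build an e-graph, saturate it with rewrites, then select and schedule together) can be illustrated with a small Python sketch. It is not SkyEgg's implementation: the e-graph is written out by hand as if saturation had already run, the implementation names (`lut_neg`, `lut_add`, `lut_mul`, `dsp_fused_neg_add_mul`) and every delay value are invented for illustration, and the scheduler is a simplified ASAP-style pass with operator chaining that ignores resource limits and multi-cycle operators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ENode:
    impl: str         # hypothetical implementation name
    children: tuple   # ids of the child e-classes
    delay_ns: float   # made-up combinational delay


# Toy "saturated" e-graph: e-class id -> alternative e-nodes.
# The "out" class holds two equivalent candidates: a LUT multiplier fed by
# separate neg/add operators, and a fused DSP mapping of the whole expression.
EGRAPH = {
    "a": [ENode("input", (), 0.0)],
    "b": [ENode("input", (), 0.0)],
    "c": [ENode("input", (), 0.0)],
    "neg": [ENode("lut_neg", ("a",), 1.5)],
    "sum": [ENode("lut_add", ("neg", "b"), 2.5)],
    "out": [
        ENode("lut_mul", ("sum", "c"), 4.5),
        ENode("dsp_fused_neg_add_mul", ("a", "b", "c"), 3.5),
    ],
}


def asap_extract(egraph, root, clock_ns):
    """Pick, per e-class, the candidate that finishes earliest.

    Finish times are (cycle, offset_ns) pairs compared lexicographically.
    Assumes an acyclic e-graph and single-cycle operators.
    """
    best = {}  # e-class id -> (finish time, chosen ENode)

    def visit(cid):
        if cid in best:
            return best[cid][0]
        candidates = []
        for node in egraph[cid]:
            if node.delay_ns > clock_ns:
                continue  # multi-cycle operators are not modeled here
            # An operator can start once its latest input is ready.
            cycle, offset = max(
                (visit(ch) for ch in node.children), default=(0, 0.0)
            )
            # Chain in the same cycle if the delay fits, else start the next one.
            if offset + node.delay_ns > clock_ns:
                cycle, offset = cycle + 1, 0.0
            candidates.append(((cycle, offset + node.delay_ns), node))
        if not candidates:
            raise ValueError(f"no implementation of {cid!r} fits the clock")
        best[cid] = min(candidates, key=lambda c: c[0])
        return best[cid][0]

    finish = visit(root)
    # Walk down from the root to collect only the implementations in use.
    selection, stack = {}, [root]
    while stack:
        cid = stack.pop()
        if cid not in selection:
            node = best[cid][1]
            selection[cid] = node.impl
            stack.extend(node.children)
    return finish[0] + 1, selection


cycles, selection = asap_extract(EGRAPH, "out", clock_ns=5.0)
print(cycles, selection["out"])
```

Because selection and scheduling happen in the same pass, the fused candidate wins on finish time, not through a fixed pattern-matching priority. If the fused e-node is removed from `EGRAPH`, the same function falls back to the chained LUT operators and needs an extra cycle under these invented delays.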

Paper Structure

This paper contains 16 sections, 11 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Motivating example: A simple neg-add-mul kernel synthesized with different approaches. (a) Original C code. (b) E-graph example. (c) Vitis HLS produces a conservative 3-cycle solution with suboptimal selection. (d) Manual configuration achieves 2 cycles by mapping all operations to a single DSP48E2 slice; SkyEgg can discover this solution automatically.
  • Figure 2: The overview of SkyEgg.
  • Figure 3: Example of e-graph construction and equality saturation with implementation modeled.
  • Figure 4: Example of timing analysis across connected e-nodes with different implementations and configurations.
  • Figure 5: The ASAP scheduler.
  • ...and 3 more figures

Theorems & Definitions (4)

  • Definition 1: Edge Delay
  • Definition 2: Path Delay
  • Definition 3: Chain Delay
  • Definition 4: Top-k Path Delays