Learning to Compare Hardware Designs for High-Level Synthesis

Yunsheng Bai, Atefeh Sohrabizadeh, Zijian Ding, Rongjian Liang, Weikai Li, Ding Wang, Haoxing Ren, Yizhou Sun, Jason Cong

TL;DR

This work tackles the design space of high-level synthesis (HLS) pragmas, where performance depends nonlinearly on pragma settings and on interactions between pragmas. It introduces compareXplore, a learning-to-rank framework that combines a graph neural network encoder, a Node Difference Attention module, and a hybrid loss to jointly learn pointwise performance and pairwise design preferences, and applies the model in a two-stage design space exploration (DSE). The method improves ranking metrics and reduces latency relative to the state of the art, with an average latency improvement of $16.11\%$ and strong gains on complex kernels such as "adi". The approach balances exploration and exploitation and lays groundwork for deeper integration of comparative learning in hardware design workflows, with potential extensions to large language model–assisted design.

Abstract

High-level synthesis (HLS) is an automated design process that transforms high-level code into hardware designs, enabling the rapid development of hardware accelerators. HLS relies on pragmas, which are directives inserted into the source code to guide the synthesis process, and pragmas have various settings and values that significantly impact the resulting hardware design. State-of-the-art ML-based HLS methods, such as HARP, first train a deep learning model, typically based on graph neural networks (GNNs) applied to graph-based representations of the source code and pragmas. They then perform design space exploration (DSE) to explore the pragma design space, rank candidate designs using the model, and return the top designs. However, traditional DSE methods face challenges due to the highly nonlinear relationship between pragma settings and performance metrics, along with complex interactions between pragmas that affect performance in non-obvious ways. To address these challenges, we propose compareXplore, a novel approach that learns to compare hardware designs for effective HLS optimization. CompareXplore introduces a hybrid loss function that combines pairwise preference learning with pointwise performance prediction, enabling the model to capture both relative preferences and absolute performance. Moreover, we introduce a novel node difference attention module that focuses on the most informative differences between designs, enabling the model to identify critical pragmas impacting performance. CompareXplore adopts a two-stage DSE, where a pointwise prediction model is used for the initial design pruning, followed by a pairwise comparison stage for precise performance verification. In extensive experiments, compareXplore achieves significant improvements in ranking metrics and generates high-quality HLS results for the selected designs, outperforming the existing SOTA method.
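The two-stage DSE described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the function names (`two_stage_dse`, `pointwise_score`, `prefers`), the `keep` and `top_k` parameters, and the round-robin win counting used in the second stage are all assumptions made for the example, and the toy scores stand in for model outputs.

```python
from typing import Callable, List, Sequence, TypeVar

Design = TypeVar("Design")


def two_stage_dse(
    candidates: Sequence[Design],
    pointwise_score: Callable[[Design], float],   # hypothetical pointwise model
    prefers: Callable[[Design, Design], bool],    # hypothetical pairwise model
    keep: int,
    top_k: int,
) -> List[Design]:
    """Prune with pointwise scores, then re-rank the survivors pairwise."""
    # Stage 1: keep the `keep` candidates with the highest predicted score.
    survivors = sorted(candidates, key=pointwise_score, reverse=True)[:keep]

    # Stage 2 (assumed scheme): round-robin comparison, counting how often
    # each survivor is preferred over another survivor.
    wins = [0] * len(survivors)
    for i, a in enumerate(survivors):
        for j, b in enumerate(survivors):
            if i != j and prefers(a, b):
                wins[i] += 1

    # sorted() is stable, so ties keep their stage-1 order.
    order = sorted(range(len(survivors)), key=lambda i: wins[i], reverse=True)
    return [survivors[i] for i in order[:top_k]]


# Toy data: the pointwise scores slightly misrank d1 and d2,
# and the pairwise comparator corrects the order among the survivors.
true_latency = {"d0": 900, "d1": 400, "d2": 450, "d3": 1200}
noisy_score = {"d0": 0.2, "d1": 0.7, "d2": 0.8, "d3": 0.1}  # higher is better

best = two_stage_dse(
    candidates=list(true_latency),
    pointwise_score=lambda d: noisy_score[d],
    prefers=lambda a, b: true_latency[a] < true_latency[b],
    keep=2,
    top_k=1,
)
print(best)
```

In this toy run the cheap pointwise pass narrows four candidates to two, so the quadratic pairwise pass only touches the survivors. That cost split is the usual motivation for pruning before comparing.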

Paper Structure (21 sections, 8 equations, 4 figures, 2 tables, 1 algorithm)

Figures (4)

  • Figure 1: Overview of compareXplore. The model consists of a GNN encoder, a Node Difference Attention module, and two MLP decoders for pairwise comparison and pointwise prediction tasks. The GNN encoder learns node embeddings by aggregating information from neighboring nodes. The Node Difference Attention module focuses on the most informative differences between node embeddings, computes attention scores based on these differences, and aggregates the embedding differences. The model is used in the two-stage DSE process depicted at the bottom. The main novel components are highlighted in red.
  • Figure 2: Loss curves for the proposed compareXplore.
  • Figure 3: Latency in cycle counts ($\downarrow$, lower is better) of the final designs selected by the DSE stage, plotted on a log scale.
  • Figure 4: As $\alpha$ increases, the model places more emphasis on the pairwise loss compared to the pointwise loss. $\alpha$ varies in $\{0.125, 0.25, 0.5, 1, 2, 4, 8\}$.
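The captions for Figure 1 and Figure 4 describe two mechanisms that a small example can make concrete: attention over node embedding differences, and a loss in which $\alpha$ weights the pairwise term. The sketch below is a rough illustration under stated assumptions, not the paper's formulation. It assumes both designs share the same node set (so differences are taken node by node), a linear scoring vector `w`, a mean-squared-error pointwise term, a binary cross-entropy pairwise term, and the combination `pointwise + alpha * pairwise`. All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence


def node_difference_attention(
    h_a: Sequence[Sequence[float]],
    h_b: Sequence[Sequence[float]],
    w: Sequence[float],
) -> List[float]:
    """Aggregate per-node embedding differences with softmax attention."""
    # Per-node difference between the two designs' embeddings.
    diffs = [[x - y for x, y in zip(na, nb)] for na, nb in zip(h_a, h_b)]
    # Attention score per node, computed from the difference itself
    # (assumed here to be a linear projection onto `w`).
    scores = [sum(wi * di for wi, di in zip(w, d)) for d in diffs]
    m = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    attn = [e / z for e in exps]
    # Attention-weighted sum of the differences.
    return [sum(a * d[k] for a, d in zip(attn, diffs)) for k in range(len(w))]


def hybrid_loss(
    pred_a: float, pred_b: float,   # pointwise predictions for designs a, b
    y_a: float, y_b: float,         # ground-truth targets (higher assumed better)
    pair_logit: float,              # pairwise decoder output: logit that a beats b
    alpha: float,                   # weight on the pairwise term
) -> float:
    """Assumed form: pointwise MSE plus alpha times pairwise cross-entropy."""
    pointwise = 0.5 * ((pred_a - y_a) ** 2 + (pred_b - y_b) ** 2)
    label = 1.0 if y_a > y_b else 0.0
    # Numerically stable binary cross-entropy on a logit.
    pairwise = (
        max(pair_logit, 0.0)
        - pair_logit * label
        + math.log1p(math.exp(-abs(pair_logit)))
    )
    return pointwise + alpha * pairwise


# Two nodes, two-dimensional embeddings; only the first node differs.
agg = node_difference_attention(
    h_a=[[1.0, 0.0], [0.0, 0.0]],
    h_b=[[0.0, 0.0], [0.0, 0.0]],
    w=[1.0, 1.0],
)
# Perfect pointwise predictions, undecided pairwise logit.
loss_small_alpha = hybrid_loss(1.0, 2.0, 1.0, 2.0, pair_logit=0.0, alpha=0.125)
loss_large_alpha = hybrid_loss(1.0, 2.0, 1.0, 2.0, pair_logit=0.0, alpha=8.0)
print(agg, loss_small_alpha, loss_large_alpha)
```

With perfect pointwise predictions the loss here comes entirely from the pairwise term, so sweeping `alpha` over the range in Figure 4 scales it directly. In a trained model the two terms would trade off against each other rather than one vanishing.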