Efficient Task Transfer for HLS DSE

Zijian Ding; Atefeh Sohrabizadeh; Weikai Li; Zongyue Qin; Yizhou Sun; Jason Cong

TL;DR

The paper tackles the challenge of transferring high-level synthesis design space exploration (DSE) strategies across evolving toolchains, where QoR and validity labels shift. It introduces Active-CEM, a model-based explorer that combines a discrete design-space sampler with a surrogate reward model and active learning to efficiently search across toolchains. A novel toolchain-invariant embedding is proposed to separate shared representations from toolchain-specific components, enabling robust cross-domain predictions. Empirical results on the HLSyn benchmark show substantial improvements in design performance and sample efficiency over AutoDSE and HARP, and demonstrate the method's potential for domain transfer and scalable HLS-based accelerator design.

Abstract

Several recent works apply model-based optimization to improve the productivity of designing domain-specific architectures with high-level synthesis (HLS). They replace time-consuming performance estimation or simulation of a design with a proxy model and automatically insert pragmas to guide hardware optimizations. In this work, we address the challenges that the evolving landscape of HLS tools poses for HLS design space exploration (DSE). As these tools develop, the quality of results (QoR) from synthesis can vary significantly, which complicates maintaining optimal design strategies across toolchains. We introduce Active-CEM, a task transfer learning scheme that leverages a model-based explorer designed to adapt efficiently to changes in toolchains. This approach improves sample efficiency by identifying high-quality design configurations under a new toolchain without requiring extensive re-evaluation. We further refine our methodology by incorporating toolchain-invariant modeling, which allows us to predict QoR changes more accurately despite shifts in the black-box implementation of the toolchains. Experimental results on the HLSyn benchmark when transitioning to a new toolchain show an average performance improvement of 1.58$\times$ over AutoDSE and 1.2$\times$ over HARP, while also increasing sample efficiency by 5.26$\times$ and reducing runtime by 2.7$\times$.
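The loop described in the abstract (a discrete sampler guided by a surrogate, with only a few designs sent to the real toolchain) can be illustrated with a minimal sketch. This is not the authors' implementation: `PatchedSurrogate`, `active_cem`, the integer pragma encoding, and all default parameters are hypothetical, and the surrogate is a simple lookup-patched function standing in for a learned model such as HARP.

```python
import numpy as np


class PatchedSurrogate:
    """Stale predictor plus exact corrections for designs labeled so far.

    Assumption: a stand-in for a learned reward model, not the paper's model.
    """

    def __init__(self, base_predict):
        self.base_predict = base_predict  # e.g. fitted on the old toolchain
        self.labels = {}                  # design tuple -> measured reward

    def predict(self, designs):
        out = []
        for d in designs:
            key = tuple(int(v) for v in d)
            # Prefer a measured label; otherwise fall back to the stale model.
            out.append(self.labels[key] if key in self.labels
                       else self.base_predict(key))
        return np.array(out, dtype=float)

    def update(self, design, reward):
        self.labels[tuple(design)] = reward


def active_cem(cardinalities, surrogate, oracle, iters=15, pop=64,
               n_elite=16, label_budget=2, smoothing=0.7, seed=0):
    """Cross-entropy search over a discrete space with selective labeling.

    cardinalities: number of options for each pragma (hypothetical encoding).
    oracle(design): true reward, standing in for one run of the new toolchain.
    """
    rng = np.random.default_rng(seed)
    # One independent categorical distribution per pragma, initially uniform.
    probs = [np.full(k, 1.0 / k) for k in cardinalities]
    for _ in range(iters):
        # Sample a population of designs, one column per pragma.
        designs = np.stack(
            [rng.choice(len(p), size=pop, p=p) for p in probs], axis=1)
        order = np.argsort(-surrogate.predict(designs))
        elites = designs[order[:n_elite]]
        # Selective labeling: only a few top-ranked, unseen designs are
        # evaluated by the oracle and fed back into the surrogate.
        spent = 0
        for d in elites:
            if spent == label_budget:
                break
            key = tuple(int(v) for v in d)
            if key not in surrogate.labels:
                surrogate.update(key, oracle(key))
                spent += 1
        # Refit each categorical to the elite frequencies, with smoothing.
        for j, p in enumerate(probs):
            freq = np.bincount(elites[:, j], minlength=len(p)) / n_elite
            probs[j] = smoothing * freq + (1.0 - smoothing) * p
    best = max(surrogate.labels, key=surrogate.labels.get)
    return best, surrogate.labels[best]
```

In this sketch the number of oracle calls is bounded by `iters * label_budget`, which is the sense in which a model-based explorer can be more sample-efficient than evaluating every sampled design.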

Paper Structure (27 sections, 2 equations, 7 figures, 12 tables, 2 algorithms)

This paper contains 27 sections, 2 equations, 7 figures, 12 tables, 2 algorithms.

Introduction
Preliminaries
HLS Design Space and Pragmas
HARP
Methodology
Toolchain-invariant modeling
DS Sampler
Optimization with Cross-Entropy Method
Model-based CEM
Selective Labeling for Model Update
Active-CEM
Sample efficiency and runtime of the algorithm
Design Space Pruning
Evaluation
Experiment setup
...and 12 more sections

Figures (7)

Figure 1: The distance between the best designs found by AutoDSE (Sohrabizadeh et al., 2022) under different toolchains. The horizontal axis represents different programs, and the vertical axis denotes the distance between designs. V20: Vitis HLS 2020.2, V21: Vitis HLS 2021.1, V23: Vitis HLS 2023.2. The distance is calculated by summing the discrete code difference between two designs over each pragma.
Figure 2: The label shift when changing the toolchain. The performance/resource labels are plotted on the X/Y axis for the old/new toolchain; a larger distance from the $Y=X$ line indicates a larger label shift.
Figure 3: Overview: The DS Sampler and the Reward Model interact with each other through active learning
Figure 4: Model architecture with invariant embedding
Figure 5: Normalized geomean performance of CEM and other optimization algorithms. Gen-$K$ and CEM-$K$ denote the population size $K$ used in the genetic search and in CEM, respectively.
...and 2 more figures
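
One plausible reading of the design distance in the Figure 1 caption can be sketched as follows. The caption does not define "discrete code difference" precisely; this sketch assumes each pragma value is encoded as its index in an ordered option list and that the per-pragma difference is the absolute index gap. The function name, pragma names, and option lists are hypothetical.

```python
def design_distance(design_a, design_b, option_lists):
    """Sum over pragmas of |index_a - index_b| in each pragma's option list.

    Assumption: options are ordered and a design maps pragma name -> option.
    """
    if design_a.keys() != design_b.keys():
        raise ValueError("designs must assign the same pragmas")
    total = 0
    for name, options in option_lists.items():
        code_a = options.index(design_a[name])
        code_b = options.index(design_b[name])
        total += abs(code_a - code_b)
    return total


# Hypothetical three-pragma design space.
option_lists = {
    "PIPELINE_L1": ["off", "cg", "fg"],
    "PARALLEL_L1": [1, 2, 4, 8],
    "TILE_L1": [1, 2, 4],
}
best_old = {"PIPELINE_L1": "off", "PARALLEL_L1": 1, "TILE_L1": 1}
best_new = {"PIPELINE_L1": "fg", "PARALLEL_L1": 4, "TILE_L1": 1}
distance = design_distance(best_old, best_new, option_lists)  # 2 + 2 + 0 = 4
```

A Hamming-style count of differing pragmas would be an equally defensible reading; only the per-pragma term changes.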
