Automated C/C++ Program Repair for High-Level Synthesis via Large Language Models

Kangwei Xu, Grace Li Zhang, Xunzhao Yin, Cheng Zhuo, Ulf Schlichtmann, Bing Li

TL;DR

This work addresses the repair of regular C/C++ programs for High-Level Synthesis (HLS) with an LLM-driven framework that uses Retrieval-Augmented Generation (RAG) to mitigate hallucinations and combines LLM-based repair with repair scripts. It introduces a five-stage pipeline (preprocessing, RAG-guided repair, bit-width optimization, equivalence verification, and circuit optimization targeting power, performance, and area (PPA)) that automatically produces compilable HLS-C code. Key contributions include a repair-template library, LLM-assisted bit-width determination based on data ranges, a cost-aware joint LLM-script repair strategy, and automated PPA tuning. Experimental results across 24 real-world applications show a higher repair pass rate, reduced LLM usage cost, and hardware efficiency gains, demonstrating practical impact for hardware-software co-design.

Abstract

In High-Level Synthesis (HLS), converting a regular C/C++ program into its HLS-compatible counterpart (HLS-C) still requires tremendous manual effort. Various program scripts have been introduced to automate this process, but the resulting code usually contains many issues that developers must repair manually. Since Large Language Models (LLMs) can automate code generation, they can also be used for automated program repair in HLS. However, because LLMs have had limited training that considers hardware and software simultaneously, hallucinations may occur during LLM-based program repair, leading to compilation failures. In addition, iterative repair with LLMs incurs a high cost. To address these challenges, we propose an LLM-driven program repair framework that takes regular C/C++ code as input and automatically generates its corresponding HLS-C code for synthesis while minimizing human repair effort. To mitigate hallucinations in LLMs and enhance prompt quality, a Retrieval-Augmented Generation (RAG) paradigm is introduced to guide the LLMs toward correct repair. In addition, we use LLMs to create a static bit-width optimization program that identifies optimized bit widths for variables. Moreover, LLM-driven HLS optimization strategies are introduced to add and tune pragmas in HLS-C programs for circuit optimization. Experimental results demonstrate that the proposed LLM-driven automated framework achieves much higher repair pass rates on 24 real-world applications than traditional scripts and the direct application of LLMs for program repair.
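The idea behind range-based bit-width selection can be illustrated with a minimal C++ sketch. It is not the authors' program: the function names `bit_length`, `required_bits`, and `required_bits_from_samples` are hypothetical, and the sketch assumes that the value range of a variable is already known, either from static analysis or from observed samples. It only computes the smallest width that holds every value in that range.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Position of the highest set bit, counting from 1; returns 0 for x == 0.
inline unsigned bit_length(std::uint64_t x) {
    unsigned n = 0;
    while (x != 0) {
        ++n;
        x >>= 1;
    }
    return n;
}

// Smallest width that holds every value in [lo, hi] (assumes lo <= hi).
// Non-negative ranges are treated as unsigned; ranges containing negative
// values are treated as two's complement (one extra sign bit).
inline unsigned required_bits(std::int64_t lo, std::int64_t hi) {
    if (lo >= 0) {
        return std::max(1u, bit_length(static_cast<std::uint64_t>(hi)));
    }
    // For negative v, ~v == -v - 1, the magnitude that must fit below the sign bit.
    const std::uint64_t neg = static_cast<std::uint64_t>(~lo);
    const std::uint64_t pos = hi >= 0 ? static_cast<std::uint64_t>(hi) : 0;
    return 1 + std::max(bit_length(neg), bit_length(pos));
}

// Hypothetical helper: derive the width from observed samples.
// This is only safe if the samples cover every value the variable can take.
inline unsigned required_bits_from_samples(const std::vector<std::int64_t>& samples) {
    if (samples.empty()) return 1;
    const auto mm = std::minmax_element(samples.begin(), samples.end());
    return required_bits(*mm.first, *mm.second);
}
```

As an illustration (not data from the paper), a variable whose values stay within 0 to 511 yields `required_bits(0, 511) == 9`, so it could be declared with a 9-bit arbitrary-precision type such as `ap_uint<9>` in Vitis HLS instead of a 32-bit `int`.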

Paper Structure (10 sections, 13 figures, 2 tables)

Figures (13)

  • Figure 1: Traditional workflow for repairing regular C/C++ programs in HLS.
  • Figure 2: The proposed LLM-driven automatic C/C++ program repair framework for HLS.
  • Figure 3: Example of the LLM using RAG to repair recursion errors.
  • Figure 4: (a) Example of the bit-width optimization scheme; (b) Distribution of 1200 samples of the variable $m$ from the real-world BFS task. The C++-based script optimizes the bit width of $m$, which needs only 9 bits instead of the default 32 bits of the int type.
  • Figure 5: LLM-driven automatic optimization scheme.
  • ...and 8 more figures
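
Figure 3 concerns recursion errors. HLS tools generally cannot synthesize recursive functions, so a common repair is to replace the recursion with a loop and an explicit, fixed-size stack. The following C++ sketch shows that kind of rewrite on a small tree-sum function. It is an illustration under stated assumptions, not the repair template from the paper: the `Tree` layout, the bound `kMaxNodes`, and all function names are hypothetical.

```cpp
// Hypothetical static bound; HLS needs array sizes known at compile time.
constexpr int kMaxNodes = 64;

// Binary tree stored in arrays, with -1 marking a missing child.
struct Tree {
    int value[kMaxNodes];
    int left[kMaxNodes];
    int right[kMaxNodes];
};

// Software-style version: recursion, which HLS tools typically reject.
int sum_recursive(const Tree& t, int node) {
    if (node < 0) return 0;
    return t.value[node] + sum_recursive(t, t.left[node]) +
           sum_recursive(t, t.right[node]);
}

// Repaired version: the same traversal with an explicit stack.
// Each node is pushed at most once, so kMaxNodes entries suffice
// as long as the arrays describe a valid tree (no cycles, no shared nodes).
int sum_iterative(const Tree& t, int root) {
    int stack[kMaxNodes];
    int top = 0;
    int total = 0;
    if (root >= 0) stack[top++] = root;
    while (top > 0) {
        const int node = stack[--top];
        total += t.value[node];
        if (t.left[node] >= 0) stack[top++] = t.left[node];
        if (t.right[node] >= 0) stack[top++] = t.right[node];
    }
    return total;
}

// Small example tree with values 1..5: node 0 has children 1 and 2,
// node 1 has children 3 and 4.
Tree make_example_tree() {
    Tree t{};
    for (int i = 0; i < kMaxNodes; ++i) {
        t.left[i] = -1;
        t.right[i] = -1;
    }
    for (int i = 0; i < 5; ++i) t.value[i] = i + 1;
    t.left[0] = 1;
    t.right[0] = 2;
    t.left[1] = 3;
    t.right[1] = 4;
    return t;
}
```

Comparing `sum_recursive` and `sum_iterative` on the same inputs is a simple form of the equivalence check that a repair flow needs after such a rewrite; on the example tree both return 15.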