Zero-Shot RTL Code Generation with Attention Sink Augmented Large Language Models

Selim Sandal; Ismail Akturk

TL;DR

This work addresses zero-shot RTL code generation from high-level hardware specifications using large language models augmented with an attention sink mechanism. By evaluating multiple instruction-tuned LLMs on a Neural Processing Unit (NPU) RTL case study, the authors compare dense, window, and attention-sink attention, demonstrating that attention sink markedly improves long-sequence code generation. The study reports that with attention sink, $4293$ of $4312$ RTL tokens are correct ($99.56\%$), requiring only $16$ fixes, versus hundreds to thousands of fixes for the other attention schemes, underscoring the method's practicality for automated design generation without RTL-focused fine-tuning. The results suggest a viable path for rapid, large-scale architectural exploration and design automation in hardware using LLMs, with attention mechanisms playing a pivotal role in maintaining specification integrity over extended outputs.
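The three attention schemes compared above can be illustrated with a small sketch of which past tokens each one lets a generated token attend to. This is not taken from the paper: it assumes the commonly described form of attention sink, where a few initial tokens stay visible alongside a sliding window of recent tokens. The function names and the `window` and `num_sinks` parameters are hypothetical.

```python
# Illustrative only: boolean causal masks for three attention schemes.
# mask[i][j] is True when token i may attend to token j.
# The names and parameters below are assumptions, not the authors' code.

def dense_mask(n):
    """Dense causal attention: token i sees every token j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]

def window_mask(n, window):
    """Sliding-window attention: token i sees only the last `window` tokens."""
    return [[i - window < j <= i for j in range(n)] for i in range(n)]

def sink_mask(n, window, num_sinks):
    """Window attention plus `num_sinks` initial tokens kept visible (assumed form)."""
    return [
        [j <= i and (j < num_sinks or j > i - window) for j in range(n)]
        for i in range(n)
    ]

def visible(mask, i):
    """Indices of the tokens that token i may attend to."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]

if __name__ == "__main__":
    n = 8
    print(visible(dense_mask(n), 7))       # all earlier tokens
    print(visible(window_mask(n, 3), 7))   # recent tokens only
    print(visible(sink_mask(n, 3, 2), 7))  # first tokens plus recent tokens
```

Under this assumption, the window scheme drops the earliest tokens as generation proceeds, while the sink scheme keeps them visible at a fixed cost per token. The earliest tokens are where a single-prompt specification sits.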

Abstract

The design and optimization of hardware have traditionally been resource-intensive, demanding considerable expertise and dependence on established design automation tools. This paper discusses the possibility of exploiting large language models to streamline the code generation process in hardware design. In contrast to earlier studies, this paper aims to use large language models that accept high-level design specifications through a single prompt to generate corresponding Register-Transfer Level (RTL) code. The ability to use large language models for RTL code generation not only expedites design iteration cycles but also facilitates the exploration of design spaces that are computationally challenging for conventional techniques. Through our evaluation, we demonstrate the shortcomings of existing attention mechanisms, and present the ability of language models to produce functional, optimized, and industry-standard compliant RTL code when a novel attention mechanism is used. These findings underscore the expanding role of large language models in shaping the future landscape of architectural exploration and automation in hardware design.

Paper Structure (13 sections, 2 figures, 2 tables)

This paper contains 13 sections, 2 figures, 2 tables.

Introduction
Background
Existing LLMs and their limitations in RTL code generation
Instruction-Tuning and fine-tuning LLMs for RTL code generation
Attention Mechanisms in LLMs
A Case Study: NPU
Evaluation
Evaluation Setup
Results
Prompt Generation
Discussion and Future Work
Related Work
Conclusion

Figures (2)

Figure 1: Block diagram of pipelined NPU datapath.
Figure 2: Success percentage of generated tokens with different attention mechanisms using WizardCoder-Python-34B-V1.0.
