PrefixGPT: Prefix Adder Optimization by a Generative Pre-trained Transformer
Ruogu Ding, Xin Ning, Ulf Schlichtmann, Weikang Qian
TL;DR
PrefixGPT reframes prefix-adder optimization as direct sequence generation. It represents the prefix graph as a sequence of 2D coordinates and applies a dynamic legality mask during generation, so every design is valid by construction. A decoder-only Transformer with spatial rotary position embeddings (RoPE) and a two-head architecture learns the prefix-adder grammar through large-scale pre-training on valid sequences, followed by reinforcement learning (RL) fine-tuning with an area-delay product (ADP) objective, augmented by best-design retrieval. The approach achieves state-of-the-art ADP, including a 7.7% improvement at 48 bits and up to 79.1% lower mean ADP than baselines, with markedly lower variance across initializations and fast generation times. These results suggest that GPT-style models can master complex hardware design spaces and generate high-quality, valid circuits without post hoc repairs, offering a scalable path for automated EDA tooling.
Abstract
Prefix adders are widely used in compute-intensive applications for their high speed. However, designing optimized prefix adders is challenging due to strict design rules and an exponentially large design space. We introduce PrefixGPT, a generative pre-trained Transformer (GPT) that directly generates optimized prefix adders from scratch. Our approach represents an adder's topology as a two-dimensional coordinate sequence and applies a legality mask during generation, ensuring that every design is valid by construction. PrefixGPT features a customized decoder-only Transformer architecture. The model is first pre-trained on a corpus of randomly synthesized valid prefix adders to learn the design rules and is then fine-tuned to navigate the design space toward higher design quality. Compared with existing works, PrefixGPT not only finds a new optimal design with a 7.7% improved area-delay product (ADP) but also exhibits superior exploration quality, lowering the average ADP by up to 79.1%. This demonstrates the potential of GPT-style models to first master complex hardware design principles and then apply them for more efficient design optimization.
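To make the idea of a coordinate sequence with a legality mask concrete, the following is a minimal sketch and not the paper's actual encoding. It assumes each prefix node is written as a pair (msb, lsb) covering bits msb down to lsb, and it uses the standard prefix-graph rule that a node may be created only by merging two existing, adjacent spans. All function names are hypothetical.

```python
# Illustrative sketch only: the (msb, lsb) encoding and all names below are
# assumptions for this example, not PrefixGPT's actual representation.

def initial_nodes(n):
    """Input nodes: bit i alone covers the span (i, i)."""
    return {(i, i) for i in range(n)}

def is_legal(node, existing):
    """A node (msb, lsb) is legal if it is new and can be formed by merging an
    existing upper span (msb, k) with an adjacent existing lower span (k-1, lsb)."""
    msb, lsb = node
    if msb <= lsb or node in existing:
        return False
    return any((msb, k) in existing and (k - 1, lsb) in existing
               for k in range(lsb + 1, msb + 1))

def legality_mask(n, existing):
    """n x n grid: mask[msb][lsb] is True when (msb, lsb) may be generated next."""
    return [[is_legal((m, l), existing) for l in range(n)] for m in range(n)]

def build(n, sequence):
    """Replay a coordinate sequence, rejecting any illegal step."""
    nodes = initial_nodes(n)
    for node in sequence:
        if not is_legal(node, nodes):
            raise ValueError(f"illegal node {node}")
        nodes.add(node)
    return nodes

def is_complete(n, nodes):
    """A prefix adder needs every output span (i, 0)."""
    return all((i, 0) in nodes for i in range(n))

# A 4-bit ripple-carry structure expressed as a coordinate sequence.
ripple = build(4, [(1, 0), (2, 0), (3, 0)])
print(is_complete(4, ripple))
```

In a generative setting, a mask of this kind would be applied to the model's output distribution at each step so that only legal coordinates can be sampled. How PrefixGPT computes and applies its mask is not specified in this section.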
