Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization

Animesh Basak Chowdhury, Marco Romanelli, Benjamin Tan, Ramesh Karri, Siddharth Garg

TL;DR

This work tackles logic-synthesis optimization where the quality depends on the synthesis recipe. It introduces ABC-RL, a retrieval-guided reinforcement-learning framework that adaptively blends a pre-trained policy with online MCTS via a tunable parameter $α$, based on how similar the new netlist is to training data. By using nearest-neighbor retrieval on GNN embeddings to compute $δ_{G_0}$ and a sigmoid-based mapping to $α$, ABC-RL consistently outperforms state-of-the-art ML methods and pure MCTS across MCNC and EPFL benchmarks, achieving up to $24.8\%$ QoR gains and up to $9\times$ iso-QoR speed-ups. The approach demonstrates robust improvements and highlights the potential of retrieval-guided RL to address distribution shift in design spaces, with practical impact for faster and higher-quality logic-synthesis in hardware design.
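The retrieval step summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the cosine distance metric, the function names, the direction of the sigmoid (a larger distance gives an $α$ closer to 1, meaning less trust in the pre-trained policy), and the tempered-prior blending rule are all assumptions made for the example. Only the overall idea (nearest-neighbor distance on embeddings, then a sigmoid with a threshold $\delta_{th}$ and temperature $T$) comes from the text.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two embedding vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest_neighbor_distance(query, training_embeddings):
    """Distance from a new netlist embedding to its closest training embedding."""
    return min(cosine_distance(query, e) for e in training_embeddings)


def alpha_from_distance(delta, delta_th, temperature):
    """Map a distance to alpha in (0, 1) with a shifted, scaled sigmoid.

    Assumed direction: delta far above delta_th gives alpha near 1.
    The two branches avoid overflow in math.exp for large |z|.
    """
    z = (delta - delta_th) / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def blend_prior(prior, alpha):
    """Hypothetical blending rule: temper the policy prior by (1 - alpha).

    alpha = 0 keeps the pre-trained prior unchanged; alpha = 1 flattens it
    to a uniform prior, which leaves the search to plain MCTS statistics.
    """
    weights = [p ** (1.0 - alpha) for p in prior]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    train = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # toy training embeddings
    new_design = [0.9, 0.1, 0.3]                # toy embedding of a new netlist
    delta = nearest_neighbor_distance(new_design, train)
    alpha = alpha_from_distance(delta, delta_th=0.05, temperature=0.02)
    print(delta, alpha, blend_prior([0.7, 0.2, 0.1], alpha))
```

With this rule, a design whose embedding lies close to the training set keeps most of the pre-trained prior, while an unfamiliar design pushes the prior toward uniform so that the search relies mainly on its own rollouts.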

Abstract

Logic synthesis, a pivotal stage in chip design, entails optimizing chip specifications encoded in hardware description languages like Verilog into highly efficient implementations using Boolean logic gates. The process involves a sequential application of logic minimization heuristics (a "synthesis recipe"), with their arrangement significantly impacting crucial metrics such as area and delay. Addressing the challenge posed by the broad spectrum of design complexities, from variations of past designs (e.g., adders and multipliers) to entirely novel configurations (e.g., innovative processor instructions), requires a nuanced "synthesis recipe" guided by human expertise and intuition. This study conducts a thorough examination of learning and search techniques for logic synthesis, unearthing a surprising revelation: pre-trained agents, when confronted with entirely novel designs, may veer off course, detrimentally affecting the search trajectory. We present ABC-RL, which uses a meticulously tuned $α$ parameter to adjust recommendations from pre-trained agents during the search process. Because $α$ is computed from similarity scores obtained through nearest-neighbor retrieval from the training dataset, ABC-RL yields superior synthesis recipes tailored for a wide array of hardware designs. Our findings showcase substantial enhancements in the quality-of-result (QoR) of synthesized circuits, with improvements of up to 24.8% compared to state-of-the-art techniques. Furthermore, ABC-RL achieves up to a 9x reduction in runtime (iso-QoR) when compared to current state-of-the-art methodologies.

Paper Structure

This paper contains 40 sections, 8 equations, 12 figures, 10 tables, 1 algorithm.

Figures (12)

  • Figure 1: (Left) A hardware design in Verilog is first transformed into an AIG, i.e., a netlist containing only AND and NOT gates. Then a sequence of functionality-preserving transformations (here, picked from the set {rw, rwz, …, b}) is applied to generate an optimized AIG. Each such sequence is called a synthesis recipe. The synthesis recipe with the best QoR (e.g., area or delay) is shown in green. (Right) Applying rw and b to an AIG results in an AIG with fewer nodes and lower depth.
  • Figure 2: Training, validation and test splits in our experiments. Netlists from each benchmark are represented in each split. In the test set, MCNC netlists are relabeled [C1-C12], EPFL-arith netlists [A1-A4] and EPFL-control netlists [R1-R4].
  • Figure 3: Policy network architecture. GCN: Graph convolution network, BN: Batch normalization, FC: Fully connected layer
  • Figure 4: ABC-RL flow: training the agent (left), setting temperature $T$ and threshold $\delta_{th}$ (middle), and recipe generation at inference time (right).
  • Figure 5: Area-delay product reduction (in %) compared to resyn2 on MCNC circuits.
  • ...and 7 more figures