Large Reasoning Models for 3D Floorplanning in EDA: Learning from Imperfections

Fin Amin; Nirjhor Rouf; Tse-Han Pan; Md Kamal Ibn Shafi; Paul D. Franzon

TL;DR

This work introduces Dreamweaver, a large reasoning model (LRM) for 3D IC floorplanning. It operates over a structured, large discrete action space: the actor emits a continuous 3D action embedding, and k-NN maps that embedding to nearby valid placements. Trained offline on randomly generated trajectories and guided by return-to-go signals, its actor-critic transformer architecture optimizes multiple objectives (wirelength, congestion, and thermals) and achieves lower wirelength than the prior state-of-the-art Chipformer on MCNC benchmarks. The approach reduces reliance on expert trajectories, generalizes well, and decouples the output dimension from the canvas size, offering a path toward more scalable and cost-efficient IC floorplanning; practical limitations are discussed with a view to future industrial deployment.

Abstract

In this paper, we introduce Dreamweaver, which belongs to a new class of auto-regressive decision-making models known as large reasoning models (LRMs). Dreamweaver is designed to improve 3D floorplanning in electronic design automation (EDA) via an architecture that melds advancements in sequence-to-sequence reinforcement learning algorithms. A significant advantage of our approach is its ability to effectively reason over large discrete action spaces, which is essential for handling the numerous potential positions for various functional blocks in floorplanning. Additionally, Dreamweaver demonstrates strong performance even when trained on entirely random trajectories, showcasing its capacity to leverage sub-optimal or non-expert trajectories to enhance its results. This innovative approach contributes to streamlining the integrated circuit (IC) design flow and reducing the high computational costs typically associated with floorplanning. We evaluate its performance against a current state-of-the-art method, highlighting notable improvements.

Paper Structure (16 sections, 10 equations, 4 figures, 2 tables)

Introduction and Motivation
Background
ML Solutions to Floorplanning
RL and Decision Transformers
Reasoning in Structured Large Discrete Action Spaces
Large Reasoning Models
Dreamweaver Architecture
Training: Learning from Random Floorplans
Experiments
RL Environment
Evaluations
Results and Discussion
Appendix
Model and Experiment Parameters
Limitations
...and 1 more sections

Figures (4)

Figure 1: Our agent melds recent advancements in auto-regressive decision models with the Wolpertinger architecture. This figure shows the flow of information across a single time-step of inference. First, the SARs ($\gamma_t$) are input into the actor, which proposes a candidate centroid location, $\alpha = (\hat{x},\hat{y},\hat{z})$. Then, $k$-NN is used to find the $k$ legal actions nearest to $\alpha$. Finally, the critic produces $\hat{g}_t$, the return-to-go for the remainder of the trajectory, for each of the $k$ actions. The action that produces the best $\hat{g}_t$ is ultimately selected in accordance with equation \ref{eq:critic_argmax}.
Figure 2: The shape and positions of the input embeddings as they are placed in the token embeddings. We also add a time-wise positional embedding to the entire token (not shown). Note that we differ from existing work by using this entire $d_{model}$-sized token embedding for a single timestep. Other work typically splits states, actions, and rewards into separate sub-timesteps.
Figure 3: Our environment returns three proxy metrics at each timestep, $t$: wirelength (WL), congestion and thermals. This chart shows the Dreamweaver critic's error in estimating the RTG with respect to $\gamma_t$. Error bars show variance over the trajectories.
Figure 4: 3-layer floorplans generated from the ami33 and ami49 netlists. The total Manhattan wirelength ($\downarrow$) of Dreamweaver is 200,738 for ami33 and 28,636 for ami49, compared with 225,877 and 39,442, respectively, for Chipformer. The lower wirelength for ami49 can be explained by ami49 having significantly fewer connections than ami33. Note that Chipformer was trained offline and fine-tuned online, whereas Dreamweaver was trained offline only.
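
The single-step selection described in the Figure 1 caption (actor proposal, $k$-NN over legal actions, critic scoring, argmax) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_action`, the brute-force Euclidean $k$-NN, the toy critic, and the convention that a higher return-to-go estimate is better are all assumptions made for illustration.

```python
import numpy as np

def select_action(proposal, legal_actions, critic, k=4):
    """Return (index, centroid) of the legal action chosen for this step.

    proposal      : (3,) continuous centroid proposed by the actor (assumed given)
    legal_actions : (n, 3) centroids of currently legal placements
    critic        : hypothetical callable mapping a centroid to an estimated
                    return-to-go; higher is ASSUMED to be better
    """
    legal_actions = np.asarray(legal_actions, dtype=float)
    proposal = np.asarray(proposal, dtype=float)
    # Euclidean distance from the proposal to every legal centroid.
    dists = np.linalg.norm(legal_actions - proposal, axis=1)
    k = min(k, len(legal_actions))
    # Brute-force k-NN: indices of the k closest legal actions.
    nearest = np.argsort(dists, kind="stable")[:k]
    # Score each candidate with the critic and keep the best one.
    scores = np.array([critic(legal_actions[i]) for i in nearest])
    best = int(nearest[int(np.argmax(scores))])
    return best, legal_actions[best]

# Toy data: five legal centroids on a small 3D canvas (illustrative only).
legal = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [5, 5, 1], [2, 2, 0]]
target = np.array([0.0, 1.0, 0.0])

def toy_critic(action):
    # Stand-in for a learned critic: prefers centroids close to `target`.
    return -float(np.abs(action - target).sum())

idx, centroid = select_action([0.9, 0.2, 0.0], legal, toy_critic, k=3)
print(idx, centroid)
```

With `k=1` the sketch reduces to snapping the proposal to its nearest legal placement; larger `k` lets the critic override the actor's proposal among nearby candidates. Because the actor outputs only three coordinates, its output size does not grow with the canvas.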
