NetTAG: A Multimodal RTL-and-Layout-Aligned Netlist Foundation Model via Text-Attributed Graph

Wenji Fang, Wenkai Li, Shang Liu, Yao Lu, Hongce Zhang, Zhiyao Xie

TL;DR

NetTAG addresses the limitation of purely graph-based netlist encoders by introducing a text-attributed graph representation that fuses gate semantics with circuit topology. It uses an LLM-based gate encoder (ExprLLM) and a graph transformer (TAGFormer), with cross-stage alignment to RTL and layout data, learned via self-supervised objectives. Across four functional and physical tasks, NetTAG achieves superior accuracy and prediction quality compared to task-specific baselines and state-of-the-art AIG encoders, demonstrating strong generalization and scalability. This foundation model enables versatile, multi-granularity netlist embeddings suitable for rapid adaptation to diverse EDA challenges and downstream tasks.

Abstract

Circuit representation learning has shown promise in advancing Electronic Design Automation (EDA) by capturing structural and functional circuit properties for various tasks. Existing pre-trained solutions rely on graph learning with complex functional supervision, such as truth table simulation. However, they only handle simple and-inverter graphs (AIGs), struggling to fully encode other complex gate functionalities. While large language models (LLMs) excel at functional understanding, they lack the structural awareness for flattened netlists. To advance netlist representation learning, we present NetTAG, a netlist foundation model that fuses gate semantics with graph structure, handling diverse gate types and supporting a variety of functional and physical tasks. Moving beyond existing graph-only methods, NetTAG formulates netlists as text-attributed graphs, with gates annotated by symbolic logic expressions and physical characteristics as text attributes. Its multimodal architecture combines an LLM-based text encoder for gate semantics and a graph transformer for global structure. Pre-trained with gate and graph self-supervised objectives and aligned with RTL and layout stages, NetTAG captures comprehensive circuit intrinsics. Experimental results show that NetTAG consistently outperforms each task-specific method on four largely different functional and physical tasks and surpasses state-of-the-art AIG encoders, demonstrating its versatility.
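
To make the text-attributed graph (TAG) formulation concrete, here is a minimal sketch of how a netlist might be held as a graph whose nodes carry text. It is not the authors' implementation. The class names, field names, attribute string layout, and the `area` values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Gate:
    """One netlist gate. Field names are illustrative, not from the paper."""
    name: str
    cell_type: str
    expression: str  # symbolic logic expression of the gate output
    physical: dict[str, float] = field(default_factory=dict)

    def text_attribute(self) -> str:
        """Flatten the gate's fields into one string for a text encoder."""
        phys = ", ".join(f"{k}={v}" for k, v in sorted(self.physical.items()))
        return (
            f"name: {self.name}; type: {self.cell_type}; "
            f"expr: {self.expression}; phys: {phys}"
        )


@dataclass
class TextAttributedGraph:
    """Gates keyed by name plus directed (driver, sink) edges."""
    gates: dict[str, Gate] = field(default_factory=dict)
    edges: list[tuple[str, str]] = field(default_factory=list)

    def add_gate(self, gate: Gate) -> None:
        self.gates[gate.name] = gate

    def connect(self, driver: str, sink: str) -> None:
        if driver not in self.gates or sink not in self.gates:
            raise KeyError("both endpoints must be added before connecting")
        self.edges.append((driver, sink))


# Toy two-gate netlist with placeholder physical values.
tag = TextAttributedGraph()
tag.add_gate(Gate("U1", "NAND2", "!(a & b)", {"area": 1.0}))
tag.add_gate(Gate("U2", "INV", "!U1", {"area": 0.5}))
tag.connect("U1", "U2")
print(tag.gates["U2"].text_attribute())
```

In such a layout, the per-gate strings would be what a text encoder consumes, while the edge list would be what a graph model consumes. That separation mirrors the two-part architecture the abstract describes.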

Paper Structure

This paper contains 21 sections, 8 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Overview of NetTAG. Netlists are formulated as text-attributed graphs, with functional and physical text attributes extracted for each gate. Within NetTAG, gate attributes are initially encoded by ExprLLM, then refined with global netlist graph structures using TAGFormer. NetTAG is pre-trained with self-supervised objectives and aligned with RTL and layout embeddings, enabling versatile support for both functional and physical tasks after fine-tuning.
  • Figure 2: NetTAG Workflow. Sequential netlists are chunked into combinational register cones and converted into TAGs. During pre-training, NetTAG is trained with node-level and graph-level self-supervised objectives, and it is aligned with RTL and layout embeddings. The pre-trained NetTAG then generates netlist embeddings, which are fine-tuned with netlist-stage task labels.
  • Figure 3: Circuit data design stages and modalities. RTL code is processed directly as text. Netlists are represented as TAGs, with node attributes including gate name, type, symbolic expression, and physical property. Symbolic expressions are derived from each gate’s k-hop input cone. Layout data is converted into graphs and annotated with physical information extracted from the SPEF file.
  • Figure 4: NetTAG architecture and pre-training workflow. In step 1, we collect a dataset of gate symbolic expressions and pre-train ExprLLM to enhance its understanding of Boolean formulas. In step 2, ExprLLM is frozen, and we pre-train TAGFormer to fuse the gate semantics with the global graph structure. NetTAG embeddings are cross-stage aligned with those from pre-trained auxiliary RTL and layout encoders.
  • Figure 5: Comparison with pre-trained AIG encoders on the AIG dataset.
  • ...and 3 more figures
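
The Figure 3 caption notes that each gate's symbolic expression is derived from its k-hop input cone. A minimal sketch of that idea follows, assuming a toy netlist stored as a dictionary from signal name to its driving operator and inputs. The `NETLIST` contents, the `k_hop_expression` helper, and the operator spellings are hypothetical and are not taken from the paper.

```python
# Hypothetical toy netlist: gate output -> (operator, list of input signals).
NETLIST = {
    "n1": ("AND", ["a", "b"]),
    "n2": ("NOT", ["c"]),
    "n3": ("OR", ["n1", "n2"]),
    "y": ("XOR", ["n3", "d"]),
}

# Infix symbols for the binary operators used above.
SYMBOL = {"AND": "&", "OR": "|", "XOR": "^"}


def k_hop_expression(signal: str, k: int) -> str:
    """Expand `signal` through at most k levels of driving gates.

    Signals without a driver (primary inputs) and signals at the
    k-hop boundary are left as plain variable names.
    """
    if k == 0 or signal not in NETLIST:
        return signal
    op, inputs = NETLIST[signal]
    operands = [k_hop_expression(s, k - 1) for s in inputs]
    if op == "NOT":
        return f"!{operands[0]}"
    return "(" + f" {SYMBOL[op]} ".join(operands) + ")"


for hops in (1, 2, 3):
    print(hops, k_hop_expression("y", hops))
# 1 (n3 ^ d)
# 2 ((n1 | n2) ^ d)
# 3 (((a & b) | !c) ^ d)
```

Bounding the expansion at k hops keeps each gate's expression short: signals beyond the boundary stay as opaque variable names, so under this sketch any wider context would have to come from the graph structure, not from the text.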