PreRoutGNN for Timing Prediction with Order Preserving Partition: Global Circuit Pre-training, Local Delay Learning and Attentional Cell Modeling

Ruizhe Zhong, Junjie Ye, Zhentao Tang, Shixiong Kai, Mingxuan Yuan, Jianye Hao, Junchi Yan

TL;DR

This work proposes global circuit training to pre-train a graph auto-encoder that learns a global graph embedding from the circuit netlist. It then applies a novel node updating scheme for message passing on a GCN, following the topological sorting sequence of the learned graph embedding and the circuit graph.

Abstract

Pre-routing timing prediction has recently been studied for evaluating the quality of a candidate cell placement in chip design. It involves directly estimating timing metrics at both the pin level (slack, slew) and the edge level (net delay, cell delay), without time-consuming routing. However, it often suffers from signal decay and error accumulation due to the long timing paths in large-scale industrial circuits. To address these challenges, we propose a two-stage approach. First, we propose global circuit training to pre-train a graph auto-encoder that learns the global graph embedding from the circuit netlist. Second, we use a novel node updating scheme for message passing on GCN, following the topological sorting sequence of the learned graph embedding and circuit graph. This scheme residually models the local time delay between two adjacent pins in the updating sequence, and extracts the lookup table information inside each cell via a new attention mechanism. To handle large-scale circuits efficiently, we introduce an order preserving partition scheme that reduces memory consumption while maintaining the topological dependencies. Experiments on 21 real-world circuits achieve a new SOTA R² of 0.93 for slack prediction, significantly surpassing the 0.59 achieved by the previous SOTA method. Code will be available at: https://github.com/Thinklab-SJTU/EDA-AI.
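
To make the idea of updating nodes along a topological sequence more concrete, here is a minimal Python sketch. It is not the PreRoutGNN model: it replaces the learned message passing with a classical arrival-time rule (the maximum over predecessors of predecessor arrival time plus edge delay), and the function names `topological_levels` and `propagate_arrival_time`, as well as the `(source, target, delay)` edge format, are assumptions made for illustration only. It shows why each pin's value is built from its predecessors plus a local delay, which is also why errors can accumulate along long paths.

```python
from collections import defaultdict


def topological_levels(num_nodes, edges):
    """Group the nodes of a DAG into topological levels (Kahn's algorithm).

    edges is a list of (source, target, delay) tuples (hypothetical format).
    A node is placed one level after its deepest predecessor.
    """
    succ = defaultdict(list)
    indeg = [0] * num_nodes
    for u, v, _ in edges:
        succ[u].append(v)
        indeg[v] += 1

    frontier = [n for n in range(num_nodes) if indeg[n] == 0]
    levels = []
    while frontier:
        levels.append(frontier)
        nxt = []
        for u in frontier:
            for v in succ[u]:
                indeg[v] -= 1
                if indeg[v] == 0:  # all predecessors already processed
                    nxt.append(v)
        frontier = nxt
    return levels


def propagate_arrival_time(num_nodes, edges, source_at=0.0):
    """Propagate arrival times level by level.

    Each non-source node takes the latest arrival among its predecessors
    plus the local delay of the connecting edge.
    """
    pred = defaultdict(list)
    for u, v, d in edges:
        pred[v].append((u, d))

    at = [source_at] * num_nodes
    for level in topological_levels(num_nodes, edges)[1:]:
        for v in level:
            at[v] = max(at[u] + d for u, d in pred[v])
    return at


# Toy circuit: pins 0 and 1 are sources, pin 3 is the endpoint.
edges = [(0, 2, 1.0), (1, 2, 2.0), (2, 3, 0.5), (0, 3, 0.25)]
print(topological_levels(4, edges))
print(propagate_arrival_time(4, edges))
```

In a learned model, the fixed `max` and the known delays would be replaced by trainable aggregation and predicted local delays, but the level-by-level update order stays the same.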

Paper Structure (16 sections, 11 equations, 8 figures, 10 tables, 1 algorithm)

Figures (8)

  • Figure 1: MSE of arrival time (AT) prediction vs. topological level. For the previous ML-based method TimingGCN (Guo et al., 2022), MSE increases rapidly with level, showing the existence of signal decay and error accumulation.
  • Figure 2: Pipeline of our approach. The circuit graph is represented as a heterogeneous DAG. We first train an auto-encoder via global circuit reconstruction in a self-supervised way. After global circuit pre-training, we drop the decoder and freeze or fine-tune the encoder to map each circuit graph to a low-dimensional latent space. The latent vector is treated as the global graph embedding and is concatenated to the original node features. Finally, the heterogeneous circuit DAG is fed to our PreRoutGNN for timing prediction.
  • Figure 3: Multi-head Joint Attention (MJA) for cell modeling. We model the querying-indexing-interpolation process of the cell lookup table with a joint attention mechanism.
  • Figure 4: Illustration of order preserving graph partition. Padding nodes are not involved in loss computation or backpropagation (BP).
  • Figure 5: GPU memory cost and training time comparison between training on whole graphs and on partitioned graphs. The number in brackets indicates the maximum sub-graph size.
  • ...and 3 more figures
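
The order preserving partition with padding nodes (Figure 4) can be pictured with a small Python sketch. The exact partitioning rule is not given in this excerpt, so the version below is only one plausible reading: it cuts a topologically sorted node list into consecutive chunks and copies in any out-of-chunk predecessors as padding nodes. The function name `order_preserving_partition`, the dictionary layout, and the choice to count only core nodes toward `max_size` are all assumptions.

```python
def order_preserving_partition(topo_order, edges, max_size):
    """Split a topologically sorted node list into consecutive chunks.

    topo_order: node ids already in topological order.
    edges: list of (source, target) pairs of the DAG.
    max_size: maximum number of core nodes per sub-graph (assumption:
              padding nodes are not counted toward this limit).

    Each part has "core" nodes (used for the loss) and "padding" nodes:
    predecessors from earlier chunks that are kept only so the core nodes
    still see their inputs.
    """
    preds = {}
    for u, v in edges:
        preds.setdefault(v, []).append(u)

    parts = []
    for start in range(0, len(topo_order), max_size):
        core = topo_order[start:start + max_size]
        core_set = set(core)
        padding, seen = [], set()
        for v in core:
            for u in preds.get(v, []):
                # A predecessor outside the chunk must come from an earlier
                # chunk, because topo_order is topologically sorted.
                if u not in core_set and u not in seen:
                    seen.add(u)
                    padding.append(u)
        parts.append({"core": core, "padding": padding})
    return parts


def loss_mask(part):
    """1 for core nodes, 0 for padding nodes (excluded from the loss)."""
    return [0] * len(part["padding"]) + [1] * len(part["core"])


topo_order = [0, 1, 2, 3, 4]
edges = [(0, 2), (1, 2), (2, 3), (0, 3), (3, 4)]
for part in order_preserving_partition(topo_order, edges, max_size=2):
    print(part, loss_mask(part))
```

Because chunks follow the topological order, every dependency of a core node is either in the same chunk or available as a padding node, which is one way to keep the topological dependencies while bounding the size of each sub-graph.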