Timing-Driven Global Placement by Efficient Critical Path Extraction

Yunqi Shi, Siyuan Xu, Shixiong Kai, Xi Lin, Ke Xue, Mingxuan Yuan, Chao Qian

TL;DR

This paper tackles timing-driven global placement for modern ICs, where path-based timing information is essential but hard to scale. It presents a GPU-accelerated framework that extends DREAMPlace with a fine-grained pin-to-pin attraction objective, an efficient critical-path extraction module, and a quadratic distance loss aligned with the RC timing model. On the ICCAD2015 benchmarks, the method reduces total negative slack (TNS) by up to around 50% and improves worst negative slack (WNS), while maintaining or improving half-perimeter wirelength (HPWL) relative to the baselines. Path-aware optimization of this kind offers a scalable route to faster timing closure in high-performance, timing-constrained placement.

Abstract

Timing optimization during the global placement of integrated circuits has been a significant focus for decades, yet it remains a complex, unresolved issue. Recent analytical methods typically use pin-level timing information to adjust net weights, which is fast and simple but neglects the path-based nature of the timing graph. Existing path-based methods, however, struggle to balance accuracy and efficiency because the number of critical paths grows exponentially. In this work, we propose a GPU-accelerated timing-driven global placement framework that integrates accurate path-level information into the efficient DREAMPlace infrastructure. It optimizes a fine-grained pin-to-pin attraction objective and is facilitated by efficient critical path extraction. We also design a quadratic distance loss function specifically to align with the RC timing model. Experimental results demonstrate that our method significantly outperforms the current leading timing-driven placers, achieving an average improvement of 40.5% in total negative slack (TNS) and 8.3% in worst negative slack (WNS), as well as an improvement in half-perimeter wirelength (HPWL).
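To make the pin-to-pin attraction idea concrete, here is a minimal NumPy sketch of a weighted quadratic distance loss over pin pairs taken from critical paths. It is an illustration under stated assumptions, not the paper's implementation: the function names, the `(pos, pairs, weights)` data layout, and the per-pair weights are all hypothetical, and the actual framework runs inside DREAMPlace on GPU. The quadratic form mirrors the fact that the delay of a distributed RC wire grows with the square of its length, since both resistance and capacitance scale linearly with length.

```python
import numpy as np

def pin2pin_quadratic_loss(pos, pairs, weights):
    """Weighted sum of squared Euclidean distances between pin pairs.

    pos:     (num_pins, 2) float array of pin coordinates (hypothetical layout)
    pairs:   (num_pairs, 2) int array of (driver_pin, sink_pin) indices
    weights: (num_pairs,) float array, e.g. derived from path criticality
    """
    d = pos[pairs[:, 0]] - pos[pairs[:, 1]]
    return float(np.sum(weights * np.sum(d * d, axis=1)))

def pin2pin_quadratic_grad(pos, pairs, weights):
    """Analytic gradient of the loss above with respect to pin positions."""
    d = pos[pairs[:, 0]] - pos[pairs[:, 1]]
    g = 2.0 * weights[:, None] * d
    grad = np.zeros_like(pos)
    # np.add.at accumulates correctly when a pin appears in several pairs.
    np.add.at(grad, pairs[:, 0], g)
    np.add.at(grad, pairs[:, 1], -g)
    return grad

# Three pins on one critical path: pin 0 -> pin 1 -> pin 2.
pos = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
pairs = np.array([[0, 1], [1, 2]])
weights = np.array([1.0, 0.5])

loss = pin2pin_quadratic_loss(pos, pairs, weights)
grad = pin2pin_quadratic_grad(pos, pairs, weights)
```

Unlike net weighting, which scales the wirelength of an entire net, each term here pulls on exactly one driver-sink pair, so only the connections that lie on extracted critical paths are tightened.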

Paper Structure

This paper contains 16 sections, 10 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Our timing-driven placement flow enabling GPU-acceleration. Gradients in orange are propagated on GPU.
  • Figure 2: Illustration of traditional net weighting and pin-to-pin attraction.
  • Figure 3: Visualization of a specific critical path optimized using different distance losses. The slack of each path is given at the top of each figure.
  • Figure 4: Runtime breakdown comparison between DREAMPlace 4.0 and our method for case superblue1. The time spent by each component is normalized by 615 seconds, the total runtime of DREAMPlace 4.0.
  • Figure 5: Optimization iterations for case superblue1. The blue curve is DREAMPlace 4.0, and the yellow one is our method. Timing optimization of both methods starts from the 500th iteration. TNS and WNS values are converted to their absolute values in the figure for better illustration.
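For readers less familiar with the TNS and WNS metrics named in the abstract and plotted in Figure 5, the following sketch computes both from a list of endpoint slacks using the usual static-timing definitions: TNS is the sum of all negative slacks and WNS is the most negative slack. The function name and the input format are assumptions for illustration; the paper's own evaluation flow is not shown in this summary.

```python
def tns_wns(endpoint_slacks):
    """Return (TNS, WNS) for a list of timing-endpoint slacks.

    Only violating endpoints (slack < 0) contribute. If nothing violates,
    both metrics are reported as 0.
    """
    negative = [s for s in endpoint_slacks if s < 0]
    tns = sum(negative)
    wns = min(negative) if negative else 0.0
    return tns, wns

# Two violating endpoints out of four.
tns, wns = tns_wns([-3.0, 1.5, -0.5, 0.0])
```

Both values are non-positive under these definitions, which is why Figure 5 plots their absolute values.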