Annotating Slack Directly on Your Verilog: Fine-Grained RTL Timing Evaluation for Early Optimization
Wenji Fang, Shang Liu, Hongce Zhang, Zhiyao Xie
TL;DR
RTL-Timer addresses the lack of fine-grained timing information at the RTL stage with a register-endpoint-centric timing estimator that predicts both per-endpoint slack and overall design timing. It combines four RTL representations, including a universal bit-level representation (BOG), through ensemble learning; trains with a path-aware loss over the slowest path and sampled paths; and applies both bit-wise and signal-wise modeling to predict $AT_{\mathcal{N}}(ep)$ and $Rank_{\mathcal{N}}(ep)$. The key contributions are the first general fine-grained RTL timing predictor applicable to unknown designs, automatic slack annotation on HDL code, and optimization guidance for synthesis via group_path and retiming. This guidance yields average improvements of $9.9\%$ in TNS and $3.1\%$ in WNS, and the improvements are preserved through placement and post-placement steps. At the RTL stage, predictions correlate strongly with ground truth ($R\approx0.89$ for fine-grained slack, $R\approx0.98$ for TNS, and $R\approx0.91$ for WNS), so early, RTL-guided optimization translates into measurable timing benefits downstream.
Abstract
In digital IC design, the early register-transfer level (RTL) stage offers greater optimization flexibility for both designers and EDA tools than post-synthesis netlists or layouts. However, timing information is typically unavailable at this early stage. Some recent machine learning (ML) solutions predict the total negative slack (TNS) and worst negative slack (WNS) of an entire design at the RTL stage, but the fine-grained timing information of individual registers remains unavailable. In this work, we address the unique challenges of RTL timing prediction and introduce our solution, RTL-Timer. To the best of our knowledge, this is the first fine-grained general timing estimator applicable to any given design. RTL-Timer explores multiple promising RTL representations and proposes customized loss functions to capture the maximum arrival time at register endpoints. Its fine-grained predictions are further applied to guide optimization in a standard synthesis flow. On unknown test designs, the predictions achieve an average correlation above 0.89 with ground truth, and the guided optimization improves WNS by around 3% and TNS by around 10%.
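To make the relationship between the fine-grained and design-level metrics concrete, the following minimal Python sketch derives per-endpoint slack, WNS, and TNS from maximum arrival times at register endpoints. It is an illustration under stated assumptions, not RTL-Timer's implementation: the `Endpoint` class, the function names, and the example signal names are hypothetical; slack is taken as clock period minus setup time minus arrival time; and WNS is reported as 0 when no endpoint violates timing, a convention that varies between tools.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """A register endpoint with a (predicted) maximum arrival time.

    Hypothetical container for illustration only.
    """
    name: str
    arrival_ns: float


def slack(ep: Endpoint, clock_period_ns: float, setup_ns: float = 0.0) -> float:
    """Setup slack: required time minus arrival time (assumed simple model)."""
    return clock_period_ns - setup_ns - ep.arrival_ns


def wns(slacks: list[float]) -> float:
    """Worst negative slack: the most negative slack, or 0 if none is negative."""
    return min(min(slacks), 0.0)


def tns(slacks: list[float]) -> float:
    """Total negative slack: the sum of all negative slacks."""
    return sum(s for s in slacks if s < 0.0)


def rank_by_criticality(endpoints: list[Endpoint], clock_period_ns: float) -> list[Endpoint]:
    """Order endpoints from most to least timing-critical (smallest slack first)."""
    return sorted(endpoints, key=lambda ep: slack(ep, clock_period_ns))


if __name__ == "__main__":
    # Hypothetical endpoints and a 1.0 ns clock period.
    eps = [
        Endpoint("ctrl_state_reg[0]", 1.20),
        Endpoint("data_q_reg[3]", 0.80),
        Endpoint("acc_reg[7]", 1.05),
    ]
    slacks = [slack(ep, clock_period_ns=1.0) for ep in eps]
    print("WNS:", wns(slacks))
    print("TNS:", tns(slacks))
    print("Most critical:", rank_by_criticality(eps, 1.0)[0].name)
```

A ranking like the one returned by `rank_by_criticality` is the kind of per-endpoint information that design-level WNS and TNS predictions alone cannot provide, and it is what makes endpoint-level guidance such as path grouping possible.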
