Accelerating Detailed Routing Convergence through Offline Reinforcement Learning
Afsara Khan, Austin Rovinski
TL;DR
Problem: detailed routing converges slowly because of complex design rules. Approach: an offline conservative Q-learning (CQL) model selects per-iteration routing cost weights to minimize the number of routing iterations. Key results: 5% average and up to 31% iteration reduction on unseen ISPD19 designs, with runtime speedups of 1.56x on average and up to 3.01x; the learned weight selection shows signs of generalizing across technologies. Significance: learned weight scheduling can accelerate detailed routing and carry over across designs; the authors intend to release open-source code.
Abstract
Detailed routing remains one of the most complex and time-consuming steps in modern physical design due to the challenges posed by shrinking feature sizes and stricter design rules. Prior detailed routers achieve state-of-the-art results by leveraging iterative pathfinding algorithms to route each net. However, runtime remains a major issue in detailed routers, as converging to a solution with zero design rule violations (DRVs) can be prohibitively expensive. In this paper, we propose leveraging reinforcement learning (RL) to enable rapid convergence in detailed routing by learning from previous designs. We make the key observation that prior detailed routers statically schedule the cost weights used in their routing algorithms, so the weights do not change in response to the design or technology. By training a conservative Q-learning (CQL) model to dynamically select the routing cost weights that minimize the number of algorithm iterations, we find that our approach completes the ISPD19 benchmarks 1.56x faster on average and up to 3.01x faster than the baseline router while maintaining or improving the DRV count in all cases. We also find that this learning shows signs of generalization across technologies, meaning that training on designs in one technology can translate to improved outcomes in other technologies.
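To make the idea of learning a weight schedule from logged routing runs concrete, the following is a minimal tabular sketch of discrete-action CQL, not the model used in this work. It assumes a hypothetical setup: states are coarse routing-progress buckets, actions index into an invented list of cost-weight presets, and the reward is -1 per routing iteration so that fewer iterations score higher. All names and values (weight_presets, drv_penalty, via_cost, the logged transitions) are illustrative assumptions.

```python
import numpy as np

def cql_update(q, batch, alpha=1.0, gamma=0.99, lr=0.1):
    """One gradient step of tabular, discrete-action CQL on logged transitions.

    Per-transition loss:
      0.5 * (Q(s,a) - target)^2 + alpha * (logsumexp_a' Q(s,a') - Q(s,a))
    The second term lowers Q for actions the dataset never took in state s.
    """
    grad = np.zeros_like(q)
    for s, a, r, s_next, done in batch:
        # Bootstrapped target, treated as a constant (semi-gradient).
        target = r + (0.0 if done else gamma * q[s_next].max())
        grad[s, a] += q[s, a] - target
        # Gradient of logsumexp is softmax; subtract max for stability.
        z = q[s] - q[s].max()
        probs = np.exp(z) / np.exp(z).sum()
        grad[s] += alpha * probs
        grad[s, a] -= alpha
    return q - lr * grad / len(batch)

def select_weights(q, state, weight_presets):
    """Greedy choice of a cost-weight preset for the current state."""
    return weight_presets[int(np.argmax(q[state]))]

# Hypothetical presets; these are not the parameters of any real router.
weight_presets = [
    {"drv_penalty": 1.0, "via_cost": 1.0},
    {"drv_penalty": 4.0, "via_cost": 1.0},
    {"drv_penalty": 16.0, "via_cost": 2.0},
]

# Invented offline log: (state, action, reward, next_state, done).
# In state 0, preset 1 converges immediately; preset 0 needs one more
# iteration. Preset 2 is never tried, so CQL should keep its value low.
logged = [
    (0, 1, -1.0, 0, True),
    (0, 0, -1.0, 1, False),
    (1, 0, -1.0, 1, True),
]

q = np.zeros((2, len(weight_presets)))
for _ in range(2000):
    q = cql_update(q, logged)

chosen = select_weights(q, 0, weight_presets)
```

In this toy log the greedy policy picks the preset that converged in one iteration, and the preset that never appears in the data ends up with the lowest value in state 0. That conservatism is what makes CQL suitable for training purely offline from previously recorded runs.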
