LEAPS: Topological-Layout-Adaptable Multi-Die FPGA Placement for Super Long Line Minimization

Zhixiong Di, Runzhe Tao, Jing Mai, Lin Chen, Yibo Lin

TL;DR

LEAPS addresses the challenge of minimizing super long lines (SLLs) in multi-die FPGAs by introducing a nested, GPU-accelerated placement framework that continuously optimizes SLLs across global placement, legalization, and detailed placement. It combines an augmented Lagrangian reformulation, a soft floor method to map arbitrary SLR topologies to continuous coordinates, and adaptive wirelength weighting to balance HPWL and SLL counts. The approach achieves average reductions of $43.08\%$ in SLLs and $9.99\%$ in HPWL with a $34.34\times$ runtime improvement over the state-of-the-art on ISPD 2017 benchmarks, while supporting complex topologies and clock-aware optimization. These results demonstrate LEAPS' potential to improve timing, power efficiency, and overall scalability in modern multi-die FPGA design pipelines.

Abstract

Multi-die FPGAs are crucial components in modern computing systems, particularly for high-performance applications such as artificial intelligence and data centers. In a multi-die FPGA built on a silicon interposer, super long lines (SLLs) provide the interconnections between super logic regions (SLRs). SLLs have significantly higher delay than regular interconnects, so their number needs to be minimized. As design complexity increases, the growing number of SLLs makes timing and power closure more difficult. Existing placement algorithms optimize the number of SLLs but are often limited to specific SLR topologies. Furthermore, they do not optimize SLLs continuously throughout the entire placement process. This highlights the need for more advanced and adaptable solutions. In this paper, we propose LEAPS, a comprehensive, systematic, and adaptable multi-die FPGA placement algorithm for SLL minimization. Our contributions are threefold: 1) a high-performance global placement algorithm for multi-die FPGAs that optimizes the number of SLLs while addressing other essential design constraints such as wirelength, routability, and clock routing; 2) a versatile method for more complex SLR topologies of multi-die FPGAs, overcoming the limitations of existing approaches; and 3) continuous optimization of SLLs across all placement stages, including global placement (GP), legalization (LG), and detailed placement (DP). Experimental results demonstrate the effectiveness of LEAPS in reducing SLLs and enhancing circuit performance. Compared with the most recent state-of-the-art (SOTA) method, LEAPS achieves an average reduction of 43.08% in SLLs and 9.99% in HPWL, with a 34.34$\times$ improvement in runtime.
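To make the quantity being minimized concrete, here is a minimal sketch of counting the SLLs used by a single net. It assumes that a net's SLL count is the span of its pins' bounding box measured in SLR indices (columns plus rows); this excerpt does not state the paper's exact definition, so treat that as an assumption. The function names and the uniform SLR width and height are hypothetical.

```python
from typing import List, Tuple

def slr_index(x: float, y: float, slr_width: float, slr_height: float) -> Tuple[int, int]:
    """Map a pin location to the (column, row) index of the SLR containing it.

    Assumes a uniform grid of SLRs with the origin at the lower-left corner.
    """
    return int(x // slr_width), int(y // slr_height)

def net_sll_count(pins: List[Tuple[float, float]], slr_width: float, slr_height: float) -> int:
    """Count SLL crossings for one net under the assumed definition:
    the bounding-box span of the pins in SLR-index space."""
    cols, rows = zip(*(slr_index(x, y, slr_width, slr_height) for x, y in pins))
    return (max(cols) - min(cols)) + (max(rows) - min(rows))

# A 3-pin net on a 2x2 SLR topology where each SLR is 10 x 10 units.
# The pins fall in SLRs (0, 0), (1, 0), and (1, 1).
example_net = [(2.0, 3.0), (12.0, 4.0), (13.0, 15.0)]
example_count = net_sll_count(example_net, slr_width=10.0, slr_height=10.0)
```

Under this assumed definition, a net whose pins all sit in one SLR costs zero SLLs, while the example net spans one SLR boundary horizontally and one vertically.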

Paper Structure (37 sections, 28 equations, 6 figures, 6 tables, 1 algorithm)

Figures (6)

  • Figure 1: (a) Architectural illustration of Xilinx multi-die FPGA Alveo U250: Demonstrating a $1 \times 4$ SLR topology with central I/O banks and DDR controller IPs, and a right-side Vitis platform for CPU communication. (b) Detailed view of SLR architecture: Partitioned into $2 \times 3$ clock regions and further segmented into multiple half columns. (c) Schematic of a CLB slice: Distinguishing between SLICEL and SLICEM types to highlight asymmetric compatibility.
  • Figure 2: Schematic example of a multi-die FPGA featuring a $2 \times 2$ SLR topology with an illustrative SLL calculation for a 3-pin net $n$.
  • Figure 3: Core techniques in the proposed LEAPS: (1) Nested Optimization Hierarchy: Enhances multi-electrostatic placement [multi-electrostatic_placement, openparf] for multi-objective optimization, focusing on SLL minimization. See Section "Nested Optimization Hierarchy". (2) Soft Floor Method: Transforms discrete SLR coordinates into continuous models, optimizing wirelength and SLR constraints. Refer to Section "Soft Floor Method". (3) Wirelength-weighting Optimization: Dynamically adjusts HPWL and SLL trade-offs for improved FPGA placement. Details in Section "Adaptive Wirelength-Weighting-Factor Adjusting". (4) SLL-aware Legalization: Adapts direct legalization [direct_legalize] to prioritize SLL reduction with concurrent clock constraint management. Further information in Section "Clock- and SLL-aware Legalization & Detailed Placement". (5) SLL-aware Detailed Placement: Builds on UTPlaceF [utplacef], focusing on SLL minimization, clock-awareness, and wirelength optimization. See Section "Clock- and SLL-aware Legalization & Detailed Placement".
  • Figure 4: Overview of the proposed LEAPS framework. The framework continuously optimizes the number of SLLs while handling other design objectives during the global placement, legalization, and detailed placement stages. The global placement employs a nested optimization technique to progressively converge and optimize each design objective. Subsequent legalization and detailed placement consider SLL minimization and clock routing constraints while refining the initial placement. Note that "Instance Area Adjustment" and "Carry Chain Alignment Correction" are described in [elfPlace_Meng] and [multi-electrostatic_placement], respectively, and are not repeated in this paper. The remaining components are described in this work.
  • Figure 5: Visualization of the soft floor method applied to a multi-die FPGA with a $2\times2$ SLR topology: Demonstrating variations with different $\gamma_{\mathcal{S}}$ values.
  • ...and 1 more figures
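
The idea behind the soft floor method in Figure 5 (replacing a discrete SLR index with a continuous, differentiable quantity) can be illustrated with a generic smooth staircase. The sketch below is not the paper's formula, which this excerpt does not give; it assumes a sum of sigmoids, one per SLR boundary, with a hypothetical `sharpness` parameter that only loosely plays the role of $\gamma_{\mathcal{S}}$. All names are illustrative.

```python
import math

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hard_floor(x: float, slr_size: float, num_slrs: int) -> int:
    """Discrete SLR index of coordinate x, clamped to the valid range."""
    return min(max(int(x // slr_size), 0), num_slrs - 1)

def soft_floor(x: float, slr_size: float, num_slrs: int, sharpness: float) -> float:
    """Smooth surrogate for hard_floor (assumed form, not the paper's):
    one sigmoid step centered on each internal SLR boundary.
    Larger `sharpness` gives steeper steps, closer to the hard floor."""
    return sum(
        sigmoid(sharpness * (x - k * slr_size))
        for k in range(1, num_slrs)
    )

# Four SLRs of size 10 along one axis; sample a point well inside SLR 2.
smooth_index = soft_floor(25.0, slr_size=10.0, num_slrs=4, sharpness=5.0)
exact_index = hard_floor(25.0, slr_size=10.0, num_slrs=4)
```

Away from the boundaries the surrogate closely matches the discrete index, while exactly on a boundary it sits halfway between the two neighboring indices. Because it is differentiable everywhere, a gradient-based global placer can use it inside an SLL-style cost term, at the price of some blur near the boundaries when `sharpness` is small.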