FLEX: Leveraging FPGA-CPU Synergy for Mixed-Cell-Height Legalization Acceleration

Xingyu Liu, Jiawei Liang, Linfeng Du, Yipu Zhang, Chaofang Ma, Hanwei Fan, Jiang Xu, Wei Zhang

TL;DR

The paper tackles the computational bottleneck of mixed-cell-height legalization (MGL) in VLSI design. It introduces FLEX, a CPU-FPGA co-design that partitions legalization tasks between the two devices, pipelines the finding-optimal-placement-position (FOP) step at multiple granularities, and uses Sort-Ahead Cell Shifting (SACS) to accelerate cell shifting, the most irregular part of FOP. Key contributions include a task-assignment strategy that minimizes data movement, a streaming-enabled FOP pipeline, and an FPGA-optimized SACS with a bandwidth-aware memory architecture. Empirically, FLEX achieves speedups of up to 18.3x over a CPU-GPU legalizer and up to 5.4x over a multi-threaded CPU legalizer, while also improving legalization quality in dense designs, which demonstrates the practicality of FPGA-based acceleration for irregular EDA workloads.

Abstract

In this work, we present FLEX, an FPGA-CPU accelerator for mixed-cell-height legalization tasks. We address the challenges from three perspectives. First, we optimize the task assignment strategy and partition tasks efficiently between the FPGA and the CPU to exploit their complementary strengths. Second, we employ a multi-granularity pipelining technique to accelerate the most time-consuming step in legalization, finding optimal placement position (FOP). Finally, we target the computationally intensive cell shifting process in FOP, optimizing the design so that it aligns seamlessly with the multi-granularity pipelining framework for further speedup. Experimental results show that FLEX achieves up to 18.3x and 5.4x speedups over state-of-the-art CPU-GPU and multi-threaded CPU legalizers, respectively, with better scalability, while improving legalization quality by 4% and 1%, respectively.
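To give a concrete sense of what "cell shifting" means in a legalizer, here is a minimal single-row sketch in Python. It is not the paper's SACS algorithm or FOP implementation, whose details are not given in this summary. The `Cell` type, the `shift_right` function, and the right-only shifting policy are all assumptions made for illustration. The sketch sorts the cells once up front and then resolves overlaps in a single left-to-right pass.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    # Hypothetical cell record: single-row only, units are placement sites.
    name: str
    x: int      # left edge
    width: int  # width


def shift_right(cells, row_end):
    """Resolve overlaps in one row by pushing cells rightward.

    Illustrative assumption: cells are sorted by x once, ahead of the
    shifting pass, and each cell moves only as far right as needed.
    Returns the shifted cells and the total displacement.
    """
    ordered = sorted(cells, key=lambda c: c.x)  # sort once, before shifting
    frontier = 0            # first free site right of the cells placed so far
    total_displacement = 0
    placed = []
    for c in ordered:
        new_x = max(c.x, frontier)  # keep position unless it overlaps
        if new_x + c.width > row_end:
            raise ValueError(f"cell {c.name} does not fit in the row")
        total_displacement += new_x - c.x
        placed.append(Cell(c.name, new_x, c.width))
        frontier = new_x + c.width
    return placed, total_displacement


# Target cell "t" is dropped at x=2, overlapping "a"; "b" is pushed in turn.
row = [Cell("a", 0, 4), Cell("t", 2, 3), Cell("b", 5, 2)]
placed, cost = shift_right(row, row_end=10)
print([(c.name, c.x) for c in placed], cost)
```

A real mixed-cell-height legalizer must also handle cells that span several rows and shifts in both directions, which is what makes the step irregular and costly.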

Paper Structure

This paper contains 28 sections, 2 equations, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Example of layouts before and after legalization.
  • Figure 2: Overview of the challenges and contributions.
  • Figure 4: Architecture overview of FLEX.
  • Figure 5: Comparison of original and optimized FOP algorithms.
  • Figure 6: Comparison of the original cell shifting algorithm and the sort-ahead cell shifting algorithm.
  • ...and 4 more figures