Algorithm-Driven On-Chip Integration for High Density and Low Cost

Jeongeun Kim, Sabrina Yarzada, Paul Chen, Christopher Torng

TL;DR

This work addresses the need for scalable, low-cost silicon prototyping by proposing Chipstitch, an integrated framework for active chip site aggregation. It combines algorithmic packing on a grid-based template space, a fixed-area interconnect architecture that fits between packing results, and perimeter power-shutdown domains to enable bench-level power characterization without low-power design expertise. The approach achieves substantial area reductions (up to 13x) compared with state-of-the-art physical-only methods and demonstrates practicality with a 25-site Intel 16 nm prototype, highlighting potential for nation-scale educational and research tapeouts. Collectively, the paper lays the groundwork for scalable, education-friendly chip-tapeout environments by unifying algorithmic, architectural, and VLSI design considerations.

Abstract

Growing interest in semiconductor workforce development has generated demand for platforms capable of supporting large numbers of independent hardware designs for research and training without imposing high per-project overhead. Traditional multi-project wafer (MPW) services based solely on physical co-placement have historically met this need, yet their scalability breaks down as project counts rise. Recent efforts towards scalable chip tapeouts mitigate these limitations by integrating many small designs within a shared die and attempting to amortize costly resources such as IO pads and memory macros. However, foundational principles for arranging, linking, and validating such densely integrated design sites have received limited systematic investigation. This work presents a new approach with three key techniques to address this gap. First, we establish a structured formulation of the design space that enables automated, algorithm-driven packing of many projects, replacing manual layout practices. Second, we introduce an architecture that exploits only the narrow regions between sites to provide off-chip communication and other shared needs. Third, we provide a practical approach for on-chip power domains enabling per-project power characterization at a standard laboratory bench and requiring no expertise in low-power ASIC design. Experimental results show that our approach achieves substantial area reductions of up to 13x over state-of-the-art physical-only aggregation methods, offering a scalable and cost-effective path forward for large-scale tapeout environments.

Paper Structure

This paper contains 15 sections, 17 figures, 2 tables.

Figures (17)

  • Figure 1: Recent trends towards cost-efficient silicon prototyping at scale are motivating a shift towards active chip site aggregation. (a) Classic multi-project wafer (MPW) dicing problem [kahng-mpw-ispd2004]; (b) Google and Efabless Caravel wrapper [efabless-web, efabless-caravel-web] in open-source SkyWater 130 nm technology [skywater130-web]; (c) Active chip site aggregation with logical, active, on-die architecture, with less silicon waste but little existing literature.
  • Figure 2: An example floorplan for the active chip site aggregation problem, showing five projects on a single die with an on-die, active logical interconnect to share the IO pads and control logic.
  • Figure 3: A modified stochastic solver for the active chip site packing problem (2D bin-packing, NP-hard) operating on a templated, grid-based chip design canvas to guarantee manufacturability and functionality (i.e., DRC, LVS). Three iteration snapshots show how chip sites are progressively packed into a smaller bounding box. The start point is red (global controller), and blocks are colored increasingly darker according to connection sequence. The bounding box is drawn with yellow dashed lines, and red dots on each chip site indicate port directions.
  • Figure 4: A technology-specific active chip site packing grid reduces the complexity of manufacturability design rule checks to a small, manageable subset, enabling algorithmic active chip site packing solutions to be deployed. Each color represents an instance of a particular chip site template.
  • Figure 5: Floorplan optimization via simulated annealing and iterative routing. The algorithm performs up to $M_p$ placement attempts (maximum number of simulated annealing-based placements), and for each placement, up to $M_r$ routing retries (maximum routing attempts to verify pin connectivity). Chip site blocks $C$ are iteratively placed using the placer's simulated annealing method, and a bounding box $bbox$ is computed to evaluate layout area as the optimization metric. The router attempts to route pins between the placed blocks, and if routing succeeds and improves upon previous solutions, the current layout is recorded as the best solution. The procedure returns the best successfully routed floorplan or failure if none is found.
  • ...and 12 more figures
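
The place-then-route loop described in the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `shelf_place` is a toy random row-packing placer standing in for the simulated-annealing placer, the router is a caller-supplied stub, and the names `optimize_floorplan`, `max_placements` (for $M_p$), and `max_routes` (for $M_r$) are hypothetical. Only the outer control flow and the bounding-box area metric follow the caption.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Block = Tuple[int, int]                  # (width, height) in grid cells
Placement = Dict[int, Tuple[int, int]]   # block index -> lower-left (x, y)
Router = Callable[[List[Block], Placement, random.Random], bool]


def bbox_area(blocks: List[Block], placement: Placement) -> int:
    """Area of the smallest axis-aligned box enclosing all placed blocks."""
    x_lo = min(x for x, _ in placement.values())
    y_lo = min(y for _, y in placement.values())
    x_hi = max(x + blocks[i][0] for i, (x, _) in placement.items())
    y_hi = max(y + blocks[i][1] for i, (_, y) in placement.items())
    return (x_hi - x_lo) * (y_hi - y_lo)


def shelf_place(blocks: List[Block], rng: random.Random) -> Placement:
    """Toy placer (assumption): random order and row width, packed row by row.

    Stands in for the simulated-annealing placer; rows never overlap because
    each new row starts above the tallest block of the previous row.
    """
    order = list(range(len(blocks)))
    rng.shuffle(order)
    widest = max(w for w, _ in blocks)
    row_limit = rng.randint(widest, sum(w for w, _ in blocks))
    placement: Placement = {}
    x = y = row_h = 0
    for i in order:
        w, h = blocks[i]
        if x + w > row_limit:            # block does not fit: start a new row
            x, y, row_h = 0, y + row_h, 0
        placement[i] = (x, y)
        x += w
        row_h = max(row_h, h)
    return placement


def optimize_floorplan(
    blocks: List[Block],
    route: Router,
    max_placements: int,                 # M_p in the caption
    max_routes: int,                     # M_r in the caption
    seed: int = 0,
) -> Optional[Tuple[Placement, int]]:
    """Return the smallest successfully routed floorplan, or None on failure."""
    rng = random.Random(seed)
    best: Optional[Tuple[Placement, int]] = None
    for _ in range(max_placements):
        placement = shelf_place(blocks, rng)
        area = bbox_area(blocks, placement)
        for _ in range(max_routes):
            if route(blocks, placement, rng):
                # Record only routed layouts that improve on the best so far.
                if best is None or area < best[1]:
                    best = (placement, area)
                break
    return best


if __name__ == "__main__":
    sites = [(2, 2), (3, 1), (1, 4), (2, 3)]
    always_routes = lambda b, p, r: True   # stub router (assumption)
    print(optimize_floorplan(sites, always_routes, max_placements=20, max_routes=3))
```

A real flow would replace the stub router with one that checks pin connectivity between neighboring chip sites, and the toy placer with an annealing schedule that perturbs an existing placement rather than resampling from scratch.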