PlaceIT: Placement-based Inter-Chiplet Interconnect Topologies

Patrick Iff, Benigna Bruggmann, Maciej Besta, Luca Benini, Torsten Hoefler

TL;DR

PlaceIT introduces a framework that jointly optimizes chiplet placement and inter-chiplet interconnect topologies in 2.5D stacks by inferring a placement-based topology for each placement. The method uses proxies for latency and throughput to guide the optimization, supports both homogeneously and heterogeneously shaped chiplets, and targets packaging technologies such as passive interposers and silicon bridges. Compared with a 2D mesh baseline, it reduces latency by up to 28% for synthetic L1-to-L2 traffic and by up to 62% for L2-to-memory traffic, and it reduces average packet latency by up to 18% on traffic traces, while maintaining modest area overhead. The open-source implementation supports multiple optimization algorithms and placement representations, offering a flexible tool for designing low-latency, high-throughput 2.5D interconnects in diverse packaging technologies.

Abstract

2.5D integration technology is gaining traction as it copes with the exponentially growing design cost of modern integrated circuits. A crucial part of a 2.5D stacked chip is a low-latency and high-throughput inter-chiplet interconnect (ICI). Two major factors affecting the latency and throughput are the topology of links between chiplets and the chiplet placement. In this work, we present PlaceIT, a novel methodology to jointly optimize the ICI topology and the chiplet placement. While state-of-the-art methods optimize the chiplet placement for a predetermined ICI topology, or they select one topology out of a set of candidates, we generate a completely new topology for each placement. Our process of inferring placement-based ICI topologies connects chiplets that are in close proximity to each other, making it particularly attractive for chips with silicon bridges or passive silicon interposers with severely limited link lengths. We provide an open-source implementation of our method that optimizes the placement of homogeneously or heterogeneously shaped chiplets and the ICI topology connecting them for a user-defined mix of four different traffic types. We evaluate our methodology using synthetic traffic and traces, and we compare our results to a 2D mesh baseline. PlaceIT reduces the latency of synthetic L1-to-L2 and L2-to-memory traffic, the two most important types for cache coherency traffic, by up to 28% and 62%, respectively. It also achieves an average packet latency reduction of up to 18% on traffic traces. PlaceIT enables the construction of 2.5D stacked chips with low-latency ICIs.
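
To make the idea of a placement-based topology concrete, the following is a minimal sketch and not the authors' implementation. It assumes each chiplet is reduced to a center point, that a link is allowed whenever two centers are within a hypothetical `max_link_length`, and that the average hop count can stand in as a simple latency proxy. The names `infer_links` and `average_hops` are illustrative only and do not come from the PlaceIT code base.

```python
from collections import deque
from itertools import combinations
from math import hypot


def infer_links(centers, max_link_length):
    """Connect every pair of chiplets whose centers are close enough.

    centers: list of (x, y) chiplet center coordinates (assumed model).
    max_link_length: hypothetical limit on link length, same unit as centers.
    """
    links = []
    for (i, (xi, yi)), (j, (xj, yj)) in combinations(enumerate(centers), 2):
        if hypot(xi - xj, yi - yj) <= max_link_length:
            links.append((i, j))
    return links


def average_hops(num_chiplets, links):
    """Average shortest-path hop count over all ordered chiplet pairs.

    Used here as an assumed stand-in for a latency proxy.
    Returns infinity if the inferred topology is disconnected.
    """
    adjacency = {i: [] for i in range(num_chiplets)}
    for i, j in links:
        adjacency[i].append(j)
        adjacency[j].append(i)

    total, pairs = 0, 0
    for src in range(num_chiplets):
        # Breadth-first search from src gives hop distances.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < num_chiplets:
            return float("inf")
        total += sum(dist.values())
        pairs += num_chiplets - 1
    return total / pairs if pairs else 0.0


# Four chiplets on a unit grid; two different link-length limits.
grid = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
short_links = infer_links(grid, max_link_length=1.0)
long_links = infer_links(grid, max_link_length=1.5)
print(short_links, average_hops(4, short_links))
print(long_links, average_hops(4, long_links))
```

In this toy model, a tighter link-length limit yields a ring with a higher average hop count, while a looser limit also admits the diagonal links. An optimizer in the spirit of the abstract would move the chiplets and re-infer the links for every candidate placement, rather than fixing the topology in advance.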

Paper Structure

This paper contains 29 sections, 18 figures, 7 tables.

Figures (18)

  • Figure 1: (§\ref{ssec:back-25D}) 2.5D integration technologies (side view). We show a core-to-core link (red) and an off-chip link (purple).
  • Figure 2: (§\ref{sec:coopt}) Placement and topology co-optimization.
  • Figure 3: (§\ref{sec:arch}) Overview of the PlaceIT architecture.
  • Figure 4: (§\ref{ssec:arch-cost}) Correlation of cost value with its components. Points correspond to random designs, with colors indicating the placement's cost. Red circles highlight the lowest-cost placement. Throughput is given in percent of the theoretical peak.
  • Figure 5: (§\ref{ssec:homo-repr}) Homogeneous placement representation. (b) is a mutation of (a), (c) and (d) show the process of merging (a) and (b), and (e) extracts the network of (d).
  • ...and 13 more figures