Toward Open-Source Chiplets for HPC and AI: Occamy and Beyond
Paul Scheffler, Thomas Benz, Tim Fischer, Lorenzo Leone, Sina Arjmandpour, Luca Benini
TL;DR
This work aims to narrow the performance gap between open-source chiplet designs and proprietary HPC/AI silicon, proposing a concrete roadmap built around open 2.5D RISC-V manycores. Starting with Occamy, a silicon-proven dual-chiplet system in 12 nm with 432 cores and a hierarchical crossbar, the authors establish a baseline compute density and identify interconnect limitations. They then scale to Ramora, a dual-chiplet system built on a 2D mesh NoC, which reaches a peak of 1.29 DP TFLOP/s and higher bandwidth utilization. Finally, they conceptualize Ogopogo, a quad-chiplet design in 7 nm with HBM3 that delivers 10.3 DP TFLOP/s and a node-normalized compute density 19% above Nvidia’s B200. The paper also discusses end-to-end openness, outlining the open simulation, EDA, and PDK challenges that must be addressed to realize fully open chiplet ecosystems. Together, the results indicate that open-source 2.5D designs can reach competitive HPC/AI performance, and they identify the practical bottlenecks that remain on the way to end-to-end openness.
Abstract
We present a roadmap for open-source chiplet-based RISC-V systems targeting high-performance computing and artificial intelligence, aiming to close the performance gap to proprietary designs. Starting with Occamy, the first open, silicon-proven dual-chiplet RISC-V manycore in 12nm FinFET, we scale to Ramora, a mesh-NoC-based dual-chiplet system, and to Ogopogo, a 7nm quad-chiplet concept architecture achieving state-of-the-art compute density. Finally, we explore possible avenues to extend openness beyond logic-core RTL into simulation, EDA, PDKs, and off-die PHYs.
