SynDCIM: A Performance-Aware Digital Computing-in-Memory Compiler with Multi-Spec-Oriented Subcircuit Synthesis

Kunming Shao, Fengshi Tian, Xiaomeng Wang, Jiakun Zheng, Jia Chen, Jingyu He, Hui Wu, Jinbo Chen, Xihao Guan, Yi Deng, Fengbin Tu, Jie Yang, Mohamad Sawan, Tim Kwang-Ting Cheng, Chi-Ying Tsui

TL;DR

The paper tackles the lack of automation in Digital Computing-in-Memory (DCIM) macro design by proposing SynDCIM, a performance-aware compiler that performs multi-spec-oriented subcircuit synthesis. It introduces a scalable subcircuit library, a multi-spec-oriented searcher, and an SDP-based automatic place-and-route flow to generate DCIM macros that optimize throughput, latency, power, and area for user-defined specifications. The approach yields a Pareto frontier of architectural designs and is validated through post-layout simulations and silicon measurements on a 40 nm test chip, achieving competitive energy and area efficiency relative to state-of-the-art manually designed macros. This work enables agile, specification-driven deployment of DCIM macros across diverse AI workloads and platforms, bridging the gap between high-level performance targets and manufacturable, silicon-verified implementations.
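To make the notion of a Pareto frontier over candidate designs concrete, here is a minimal sketch in Python. It is not the paper's multi-spec-oriented search algorithm, only a generic dominance filter over the four metrics named above. The `Design` type, the candidate names, and all numeric values are hypothetical and use arbitrary units.

```python
from typing import List, NamedTuple


class Design(NamedTuple):
    """Hypothetical candidate macro; every value here is illustrative."""
    name: str
    throughput: float  # higher is better
    latency: float     # lower is better
    power: float       # lower is better
    area: float        # lower is better


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on every metric and strictly better on one."""
    no_worse = (a.throughput >= b.throughput and a.latency <= b.latency
                and a.power <= b.power and a.area <= b.area)
    strictly_better = (a.throughput > b.throughput or a.latency < b.latency
                       or a.power < b.power or a.area < b.area)
    return no_worse and strictly_better


def pareto_frontier(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Made-up candidates, not results from the paper.
candidates = [
    Design("A", throughput=100.0, latency=2.0, power=5.0, area=1.0),
    Design("B", throughput=100.0, latency=3.0, power=6.0, area=1.2),  # dominated by A
    Design("C", throughput=200.0, latency=2.5, power=9.0, area=1.5),
    Design("D", throughput=50.0, latency=1.0, power=2.0, area=0.5),
]

frontier = pareto_frontier(candidates)
print([d.name for d in frontier])  # ['A', 'C', 'D']
```

Design B is dropped because A matches its throughput while being better on latency, power, and area. A, C, and D each win on at least one metric, so all three remain on the frontier.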

Abstract

Digital Computing-in-Memory (DCIM) is an innovative technology that integrates multiply-accumulate (MAC) logic directly into memory arrays to enhance the performance of modern AI computing. However, the need for customized memory cells and logic components currently necessitates significant manual effort in DCIM design. Existing tools for facilitating DCIM macro designs struggle to optimize subcircuit synthesis to meet user-defined performance criteria, thereby limiting the potential system-level acceleration that DCIM can offer. To address these challenges and enable agile design of DCIM macros with optimal architectures, we present SynDCIM, a performance-aware DCIM compiler that employs multi-spec-oriented subcircuit synthesis. SynDCIM features an automated performance-to-layout generation process that aligns with user-defined performance expectations. This is supported by a scalable subcircuit library and a multi-spec-oriented searching algorithm for effective subcircuit synthesis. The effectiveness of SynDCIM is demonstrated through extensive experiments and validated with a test chip fabricated in a 40 nm CMOS process. Testing results reveal that designs generated by SynDCIM exhibit competitive performance when compared to state-of-the-art manually designed DCIM macros.

Paper Structure

This paper contains 13 sections, 10 figures, 2 tables, and 1 algorithm.

Figures (10)

  • Figure 1: Illustration of emerging DCIM architectures and key subcircuits.
  • Figure 2: Overall framework of the proposed performance-aware DCIM compiler, SynDCIM.
  • Figure 3: Illustration of the proposed DCIM subcircuit library for synthesis.
  • Figure 4: Proposed bit-wise carry-save-adders: principle and circuit details.
  • Figure 5: Illustration of the proposed optimization techniques for adders.
  • ...and 5 more figures