MIREDO: MIP-Driven Resource-Efficient Dataflow Optimization for Computing-in-Memory Accelerator

Xiaolin He, Cenlin Duan, Yingjie Qi, Xiao Ma, Jianlei Yang

TL;DR

This work addresses the data movement bottlenecks in computing-in-memory (CIM) DNN accelerators by formulating dataflow optimization as a Mixed-Integer Programming (MIP) problem. It introduces MIREDO, a framework featuring a hierarchical CIM architecture abstraction and an analytical latency model to jointly optimize workload tiling, loop permutation, and CIM-specific constraints, including flexible, uneven mappings. A Flexible Factorization technique and a comprehensive set of constants, variables, and constraints enable efficient single-shot optimization with robust data locality considerations. Empirical results show up to $3.2\times$ improvement in Energy-Delay Product (EDP) across diverse DNN models and hardware setups, demonstrating the approach’s adaptability and effectiveness for system-level CIM optimization. Overall, MIREDO advances CIM efficiency by enabling accurate latency estimation and scalable, hardware-aware dataflow search that yields substantial performance gains in constrained environments.

Abstract

Computing-in-Memory (CIM) architectures have emerged as a promising solution for accelerating Deep Neural Networks (DNNs) by mitigating data movement bottlenecks. However, realizing the potential of CIM requires specialized dataflow optimizations, which are challenged by an expansive design space and strict architectural constraints. Existing optimization approaches often fail to fully exploit CIM accelerators, leading to noticeable gaps between theoretical and actual system-level efficiency. To address these limitations, we propose the MIREDO framework, which formulates dataflow optimization as a Mixed-Integer Programming (MIP) problem. MIREDO introduces a hierarchical hardware abstraction coupled with an analytical latency model designed to accurately reflect the complex data transfer behaviors within CIM systems. By jointly modeling workload characteristics, dataflow strategies, and CIM-specific constraints, MIREDO systematically navigates the vast design space to determine the optimal dataflow configurations. Evaluation results demonstrate that MIREDO significantly enhances performance, achieving up to $3.2\times$ improvement across various DNN models and hardware setups.
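
To make the idea of searching a constrained tiling space concrete, here is a minimal sketch. It is not MIREDO's formulation: it replaces the MIP solver with exhaustive enumeration over a toy two-dimensional tiling problem, and every name and number in it (`toy_latency`, `search_tiling`, the per-tile switch overhead, the load bandwidth, the macro dimensions) is a hypothetical assumption chosen only for illustration.

```python
from itertools import product


def divisors(n):
    """Return all positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def toy_latency(K, C, tk, tc, switch_overhead=8, load_bandwidth=16, pixels=196):
    """Hypothetical latency model (an assumption, not the paper's model).

    Each weight tile of size tk x tc pays a fixed mode-switch overhead,
    a load time proportional to the tile size, and a compute time of one
    cycle per input vector streamed through the macro.
    """
    tiles = (K // tk) * (C // tc)
    load_cycles = -(-(tk * tc) // load_bandwidth)  # ceiling division
    return tiles * (switch_overhead + load_cycles + pixels)


def search_tiling(K, C, macro_rows, macro_cols):
    """Enumerate even tilings of (K, C) that fit the macro; return the cheapest.

    Returns a tuple (latency, tk, tc), or None if no tiling is feasible.
    """
    best = None
    for tk, tc in product(divisors(K), divisors(C)):
        # Capacity constraint: one tile must fit inside a single macro.
        if tk > macro_cols or tc > macro_rows:
            continue
        cost = toy_latency(K, C, tk, tc)
        if best is None or cost < best[0]:
            best = (cost, tk, tc)
    return best


if __name__ == "__main__":
    # Toy layer: 96 output channels, 48 input channels, 32x64 macro.
    print(search_tiling(96, 48, macro_rows=32, macro_cols=64))
```

Enumeration is only workable here because the toy space has a few dozen points. The abstract's argument for a MIP formulation is that the real space, which also covers loop permutation and CIM-specific constraints, is too large to walk this way.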

Paper Structure

This paper contains 17 sections, 14 equations, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: Hierarchical abstraction of the oriented CIM accelerator.
  • Figure 2: Representative data-transfer timelines illustrating (a) mode-switch stalls of the CIM macro, (b) pipeline stall due to throughput mismatch, and (c) operand-synchronization stalls.
  • Figure 3: Overview of the proposed MIREDO framework.
  • Figure 4: MIREDO performance evaluation. (a) Analytical model accuracy validation. (b) Utilization and EDP comparison. (c) Per-layer and overall speedup comparison.
  • Figure 5: Performance comparison across various DNN models and hardware configurations.