Multi-Objective Neural Architecture Search for In-Memory Computing

Md Hasibul Amin, Mohammadreza Mohammadi, Ramtin Zand

TL;DR

The paper addresses efficient deployment of CNNs on analog in-memory computing (IMC) hardware by using neural architecture search (NAS) to optimize for both accuracy and hardware metrics. The search space covers over 640 million CNN configurations built from VGG, MVGG, and ResNet-inspired blocks. Bayesian optimization via Hyperopt, integrated with PyTorch and MNSIM 2.0, evaluates candidate CNNs on ASL, CK+, and CIFAR-10 and shows how the choice of fitness function shapes network depth and kernel counts. Accuracy-only objectives favor deeper, more feature-rich models, while latency and energy objectives yield shallower networks with fewer kernels; energy-based optimization tends to avoid RES blocks due to the global accumulator. The results demonstrate the viability of NAS for hardware-aware CNN deployment on IMC and highlight the design trade-offs involved in achieving high accuracy with low latency and energy.

Abstract

In this work, we employ neural architecture search (NAS) to enhance the efficiency of deploying diverse machine learning (ML) tasks on in-memory computing (IMC) architectures. Initially, we design three fundamental components inspired by the convolutional layers found in VGG and ResNet models. Subsequently, we utilize Bayesian optimization to construct a convolutional neural network (CNN) model with adaptable depths, employing these components. Through the Bayesian search algorithm, we explore a vast search space comprising over 640 million network configurations to identify the optimal solution, considering various multi-objective cost functions like accuracy/latency and accuracy/energy. Our evaluation of this NAS approach for IMC architecture deployment spans three distinct image classification datasets, demonstrating the effectiveness of our method in achieving a balanced solution characterized by high accuracy and reduced latency and energy consumption.
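
The fitness functions named in the abstract (accuracy alone, accuracy/latency, and accuracy/energy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Candidate` class, the `fitness` and `best` helpers, and every number in `CANDIDATES` are hypothetical placeholders chosen only to show how the choice of objective changes which network wins. In the paper, accuracy would come from training in PyTorch and latency and energy from MNSIM 2.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One evaluated network (all values below are made up for illustration)."""
    name: str
    accuracy: float  # fraction in [0, 1]
    latency: float   # arbitrary units, assumed to come from a hardware simulator
    energy: float    # arbitrary units, assumed to come from a hardware simulator


def fitness(c: Candidate, objective: str) -> float:
    """Return the fitness value (higher is better) for the chosen objective."""
    if objective == "accuracy":
        return c.accuracy
    if objective == "accuracy/latency":
        return c.accuracy / c.latency
    if objective == "accuracy/energy":
        return c.accuracy / c.energy
    raise ValueError(f"unknown objective: {objective}")


def best(candidates, objective: str) -> Candidate:
    """Pick the candidate with the highest fitness under the given objective."""
    return max(candidates, key=lambda c: fitness(c, objective))


# Hypothetical candidates, not results from the paper.
CANDIDATES = [
    Candidate("deep", accuracy=0.95, latency=4.0, energy=2.0),
    Candidate("mid", accuracy=0.90, latency=1.0, energy=1.5),
    Candidate("shallow", accuracy=0.80, latency=1.0, energy=0.5),
]

if __name__ == "__main__":
    for objective in ("accuracy", "accuracy/latency", "accuracy/energy"):
        print(objective, "->", best(CANDIDATES, objective).name)
```

With these placeholder numbers, the accuracy-only objective selects the deepest candidate, while the ratio objectives select cheaper ones. Because Hyperopt's `fmin` minimizes its objective, a search driven by it would typically return the negated fitness as the loss.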

Paper Structure (5 sections, 7 figures, 2 tables)

Figures (7)

  • Figure 1: (a) The analog IMC architecture with multiple IMC banks, each of which includes several interconnected IMC tiles [MNSIM2]. (b) The IMC tile consists of a network of tightly coupled processing elements (PEs). (c) The structure of the IMC processing element, which includes memristive crossbars in its core.
  • Figure 2: The proposed NAS methodology for multi-objective optimization of ML workloads deployed on analog IMC architectures.
  • Figure 3: The three building blocks of the CNN architecture.
  • Figure 4: The network configuration (legend: ReLU, Dropout, Softmax).
  • Figure 5: Distribution of NAS outputs for ASL dataset (The best model is marked by $\star$) (a) distribution of accuracy when $FF=accuracy$ (b) distribution of accuracy vs latency when $FF=accuracy/latency$ (c) distribution of accuracy vs energy when $FF=accuracy/energy$.
  • ...and 2 more figures