ApproxDARTS: Differentiable Neural Architecture Search with Approximate Multipliers

Michal Pinos, Lukas Sekanina, Vojtech Mrazek

TL;DR

ApproxDARTS combines differentiable neural architecture search with approximate hardware primitives to reduce the energy consumed by CNN inference. It uses the TFApprox4IL emulator and 8-bit multipliers from EvoApproxLib to replace selected convolutions with approximate counterparts during both the DARTS search and the final training. On CIFAR-10, ApproxDARTS completes an architecture search in under 10 GPU hours and achieves up to 53.84% energy savings in arithmetic operations with negligible accuracy loss, while remaining competitive with EvoApproxNAS. The work demonstrates a practical path to hardware-aware NAS that co-designs CNN architectures with approximate arithmetic for energy-efficient deployment.

Abstract

Integrating the principles of approximate computing into the design of hardware-aware deep neural networks (DNNs) has led to DNN implementations that show good output quality and highly optimized hardware parameters such as low latency or inference energy. In this work, we present ApproxDARTS, a neural architecture search (NAS) method that enables the popular differentiable neural architecture search method DARTS to exploit approximate multipliers and thus reduce the power consumption of the generated neural networks. We show on the CIFAR-10 data set that ApproxDARTS can perform a complete architecture search in less than $10$ GPU hours and produce competitive convolutional neural networks (CNNs) containing approximate multipliers in convolutional layers. For example, ApproxDARTS created a CNN showing an energy consumption reduction of (a) $53.84\%$ in the arithmetic operations of the inference phase compared to the CNN utilizing the native $32$-bit floating-point multipliers and (b) $5.97\%$ compared to the CNN utilizing the exact $8$-bit fixed-point multipliers, in both cases with a negligible accuracy drop. Moreover, ApproxDARTS is $2.3\times$ faster than EvoApproxNAS, a similar method based on an evolutionary algorithm.
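
To make the idea of an approximate multiplier in a convolutional layer concrete, the following is a minimal sketch of one common way to emulate such a circuit in software: precompute its output for every pair of 8-bit operands and replace each multiplication with a table lookup. The multiplier used here (`approx_mul8`, which drops the two least significant bits of each operand) is a hypothetical stand-in chosen for illustration. It is not a circuit from EvoApproxLib, and the function names are assumptions rather than part of TFApprox4IL.

```python
# Hypothetical approximate 8-bit unsigned multiplier (illustration only,
# NOT an EvoApproxLib circuit): ignore the two low bits of each operand.
def approx_mul8(a: int, b: int) -> int:
    return (a & 0xFC) * (b & 0xFC)


def build_lut(mul):
    """Tabulate a multiplier for all 256 x 256 pairs of 8-bit operands."""
    return [[mul(a, b) for b in range(256)] for a in range(256)]


LUT = build_lut(approx_mul8)


def approx_dot(xs, ws, lut=LUT):
    """Dot product of 8-bit values: products are looked up, sums are exact."""
    return sum(lut[x][w] for x, w in zip(xs, ws))


def mean_abs_error(lut):
    """Mean absolute error of the tabulated multiplier over all operand pairs."""
    total = sum(abs(lut[a][b] - a * b) for a in range(256) for b in range(256))
    return total / (256 * 256)


if __name__ == "__main__":
    activations = [7, 200]
    weights = [5, 100]
    print("approximate:", approx_dot(activations, weights))
    print("exact:      ", sum(x * w for x, w in zip(activations, weights)))
    print("mean abs error:", mean_abs_error(LUT))
```

A convolution is a collection of such dot products, so swapping the table swaps the multiplier without touching the rest of the layer. The accumulation stays exact in this sketch; only the products are approximated.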

Paper Structure (18 sections, 3 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Three dimensions of NAS, i.e. Search Space defining a space of all architectures that can be explored, Search Method, implementing the algorithm for the architecture space exploration, and Performance Estimation Strategy, responsible for the evaluation of candidate architectures.
  • Figure 2: ApproxDARTS (right) replaces the $\{3\times3,\ 5\times5\}$ dilated convolutions and $\{3\times3,\ 5\times5\}$ separable convolutions of the original DARTS (left) with their approximate counterparts from the TFApprox4IL framework. All other operations (i.e., max pooling, avg pooling, zero, and skip connect) remain unchanged.
  • Figure 3: The two stages of the DARTS method.
  • Figure 4: Best performing normal and reduction cells obtained during the architecture search stage of ApproxDARTS for the NGR (a, b) and 2AC (c, d) approximate multipliers.
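
The candidate operations named in the Figure 2 caption can be tied to the DARTS search stage with a small sketch. In DARTS, each edge of a cell computes a softmax-weighted mixture of all candidate operations during the search, and the strongest non-zero operation is kept when the cell is discretized. The code below illustrates that mechanism under stated assumptions: the operation labels in `OPS` are hypothetical names for the eight candidates, and the operation outputs are plain lists of floats standing in for feature maps.

```python
import math

# Hypothetical labels for the eight candidate operations in Figure 2.
OPS = [
    "zero", "skip_connect", "max_pool", "avg_pool",
    "approx_sep_conv_3x3", "approx_sep_conv_5x5",
    "approx_dil_conv_3x3", "approx_dil_conv_5x5",
]


def softmax(alphas):
    """Numerically stable softmax over the architecture parameters."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_output(op_outputs, alphas):
    """Search stage: weighted sum of the candidate operations' outputs."""
    weights = softmax(alphas)
    length = len(op_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, op_outputs))
        for i in range(length)
    ]


def discretize(alphas, ops=OPS):
    """After the search: keep the strongest operation, excluding 'zero'."""
    candidates = [i for i, name in enumerate(ops) if name != "zero"]
    best = max(candidates, key=lambda i: alphas[i])
    return ops[best]


if __name__ == "__main__":
    alphas = [0.9, 0.1, 0.0, 0.0, 0.7, 0.2, 0.3, 0.1]
    print(discretize(alphas))
```

Because the mixture is differentiable in the architecture parameters, they can be trained by gradient descent together with the network weights. In this sketch the approximate convolutions would simply be additional entries in the list of candidate outputs.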