ActNAS: Generating Efficient YOLO Models using Activation NAS

Sudhakar Sah, Ravish Kumar, Darshan C. Ganji, Ehsan Saboori

TL;DR

This work proposes Activation NAS (Act-NAS), a Hardware-Aware Neural Architecture Search (HANAS) method that optimizes activation functions per layer for specific hardware. It demonstrates that hardware-aware models learn to leverage architectural and compiler-level optimizations, resulting in highly efficient performance tailored to each hardware platform.

Abstract

Activation functions introduce non-linearity into neural networks, enabling them to learn complex patterns. Different activation functions vary in speed and accuracy, ranging from faster but less accurate options like ReLU to slower but more accurate functions like SiLU or SELU. Typically, the same activation function is used throughout an entire model architecture. In this paper, we conduct a comprehensive study on the effects of using mixed activation functions in YOLO-based models, evaluating their impact on latency, memory usage, and accuracy across CPU, NPU, and GPU edge devices. We also propose a novel approach that leverages Neural Architecture Search (NAS) to design YOLO models with optimized mixed activation functions. The best model generated through this method demonstrates a slight improvement in mean Average Precision (mAP) compared to the baseline model (SiLU), while it is 22.28% faster and consumes 64.15% less memory on the reference NPU device.

Paper Structure

This paper contains 15 sections, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Latency-mAP plot of YOLO5n and YOLO8m for two different NPUs. SiLU models are the slowest and ReLU models are the fastest on both devices.
  • Figure 2: Layer-wise activation replacement: (a) replace the first SiLU activation with ReLU; (b) replace the second activation; (c) replace the third activation.
  • Figure 3: mAP vs. NWOT cost for 131 MCUbench [sah2024mcubench] models. Each point represents one model; the NWOT and mAP scores of each model are plotted in orange and blue, respectively.
  • Figure 4: Model benchmarking using training-free estimators and on-device inference. Each candidate model is created by replacing just one activation function.
  • Figure 5: Activation NAS Process
  • ...and 3 more figures
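
The captions of Figures 2 and 4 describe candidate models that differ from the all-SiLU baseline in exactly one activation. A minimal sketch of that enumeration follows, in plain Python with no deep-learning framework. The function names, the list-of-strings model representation, and the three-layer baseline are illustrative assumptions, not the authors' implementation.

```python
import math


def relu(x: float) -> float:
    """ReLU: cheap to compute, zero for negative inputs."""
    return max(0.0, x)


def silu(x: float) -> float:
    """SiLU(x) = x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def single_replacement_candidates(baseline, replacement):
    """Return one candidate per layer, each differing from the
    baseline in exactly one activation (hypothetical helper)."""
    candidates = []
    for i, name in enumerate(baseline):
        if name != replacement:
            candidate = list(baseline)
            candidate[i] = replacement
            candidates.append(candidate)
    return candidates


# Assumed toy baseline: three activation slots, all SiLU.
baseline = ["silu", "silu", "silu"]
candidates = single_replacement_candidates(baseline, "relu")
for candidate in candidates:
    print(candidate)
```

Each candidate would then be scored for accuracy and measured for latency on the target device; that step depends on the model and hardware and is not sketched here.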