
Combining Neural Architecture Search and Automatic Code Optimization: A Survey

Inas Bachiri, Hadjer Benmeziane, Smail Niar, Riyadh Baghdadi, Hamza Ouarnoughi, Abdelkrime Aries

TL;DR

This survey tackles the problem of sub-optimal hardware efficiency when HW-NAS and Automatic Code Optimization are applied in isolation. It proposes NACOS (Hardware Aware-Neural Architecture and Compiler Optimizations co-Search) as a joint co-search framework that simultaneously optimizes neural architectures and their compiler schedules to better reflect hardware performance. The paper provides a taxonomy of two-stage and one-stage NACOS methods, surveys existing approaches, and discusses search strategies, evaluation methodologies, and key challenges such as hardware heterogeneity, generalization, and the lack of benchmarks. By highlighting cross-level synergies and outlining future directions, the work aims to enable more accurate hardware-aware optimization and broader applicability across devices and domains.

Abstract

Deep Learning models have experienced exponential growth in complexity and resource demands in recent years. Accelerating these models for efficient execution on resource-constrained devices has become more crucial than ever. Two notable techniques employed to achieve this goal are Hardware-aware Neural Architecture Search (HW-NAS) and Automatic Code Optimization (ACO). HW-NAS automatically designs accurate yet hardware-friendly neural networks, while ACO searches for the best compiler optimizations to apply to neural networks for efficient mapping and inference on the target hardware. This survey explores recent works that combine these two techniques within a single framework. We present the fundamental principles of both domains and demonstrate their sub-optimality when performed independently. We then investigate their integration into a joint optimization process that we call Hardware Aware-Neural Architecture and Compiler Optimizations co-Search (NACOS).

Paper Structure (19 sections, 3 equations, 5 figures)

Figures (5)

  • Figure 1: Cross-level joint deep learning optimization methods. In this paper, we explore works that combine HW-NAS and Automatic Code Optimization (NACOS)
  • Figure 2: The inference latency of VGG16 using two different compiler schedules, on 100 image samples from ImageNet [imagenet]. Schedule 1 and Schedule 2 are obtained by applying different sequences of parallelization, loop tiling, and fusion with various parameters using MLIR [mlir], and executing on an Intel Core i7 processor with 32 GB of RAM. The two schedules perform differently on the network; Schedule 1 outperforms Schedule 2 by making the inference on VGG16 faster. The values on the arrow lines represent the acceleration relative to the baseline time, with green indicating acceleration and red indicating deceleration. The baseline time represents the original inference time of VGG16 before applying the schedules.
  • Figure 3: The accelerations of various neural networks under different scheduling strategies. Schedule 1 significantly speeds up VGG16, but does not improve and even worsens performance for other networks. Similarly, Schedule 2 optimizes ResNet18 and ResNet34 but slightly increases VGG16's latency. This demonstrates that a well-optimized schedule for one neural network does not necessarily optimize other networks.
  • Figure 4: Hardware Aware Neural Architecture and Compiler Optimizations co-Search (NACOS) Taxonomy
  • Figure 5: Summary of existing NACOS methods and their characteristics
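
The observation behind Figures 2 and 3, that the best schedule depends on the network, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `net_a`, `net_b`, `sched_1`, `sched_2`, the functions `isolated_search` and `joint_search`, and the latency values are hypothetical and are not taken from the paper. The sketch also ignores accuracy, which a real HW-NAS objective would trade off against latency.

```python
# Hypothetical latency table in milliseconds. These are illustrative values
# only, NOT measurements from the paper: LATENCY_MS[architecture][schedule].
LATENCY_MS = {
    "net_a": {"default": 10.0, "sched_1": 9.0, "sched_2": 11.0},
    "net_b": {"default": 12.0, "sched_1": 13.0, "sched_2": 6.0},
}


def isolated_search(latency):
    """Pick the architecture under the default schedule, then tune its schedule.

    This mimics running architecture search and code optimization
    independently, one after the other.
    """
    arch = min(latency, key=lambda a: latency[a]["default"])
    sched = min(latency[arch], key=latency[arch].get)
    return arch, sched, latency[arch][sched]


def joint_search(latency):
    """Search over every (architecture, schedule) pair at once."""
    pairs = ((a, s) for a in latency for s in latency[a])
    arch, sched = min(pairs, key=lambda p: latency[p[0]][p[1]])
    return arch, sched, latency[arch][sched]


if __name__ == "__main__":
    print("isolated:", isolated_search(LATENCY_MS))
    print("joint:   ", joint_search(LATENCY_MS))
```

With this toy table, the isolated search commits to `net_a` because it is faster under the default schedule, and so never discovers that `net_b` paired with `sched_2` is faster overall. Exhaustive enumeration is used here only because the toy space has six pairs; realistic co-search spaces would need a proper search strategy.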