MGAS: Multi-Granularity Architecture Search for Trade-Off Between Model Effectiveness and Efficiency

Xiaoyun Liu, Divya Saxena, Jiannong Cao, Yuqing Zhao, Penghui Ruan

TL;DR

MG-DARTS is a unified, memory-efficient differentiable NAS framework that searches across three granularity levels (operation, filter, and weight) to optimize the accuracy-density trade-off. It rests on two core ideas: adaptive pruning with granularity-specific discretization, and a multi-stage search with progressive re-evaluation to mitigate bias. Results on CIFAR-10, CIFAR-100, and ImageNet show a better accuracy-density trade-off and lower memory use during search than state-of-the-art baselines, across both DARTS-like and MobileNet-like search spaces. The resulting architectures transfer well, and the memory savings during search make the approach practical for AutoML deployments in resource-constrained settings.

Abstract

Neural architecture search (NAS) has gained significant traction in automating the design of neural networks. To reduce search time, differentiable architecture search (DAS) reframes the traditional paradigm of discrete candidate sampling and evaluation into a differentiable optimization over a super-net, followed by discretization. However, most existing DAS methods primarily focus on optimizing the coarse-grained operation-level topology, while neglecting finer-grained structures such as filter-level and weight-level patterns. This limits their ability to balance model performance with model size. Additionally, many methods compromise search quality to save memory during the search process. To tackle these issues, we propose Multi-Granularity Differentiable Architecture Search (MG-DARTS), a unified framework that aims to discover both effective and efficient architectures from scratch by comprehensively yet memory-efficiently exploring a multi-granularity search space. Specifically, we improve on existing DAS methods in two respects. First, we adaptively adjust the retention ratios of searchable units across different granularity levels through adaptive pruning, which is achieved by learning granularity-specific discretization functions along with the evolving architecture. Second, we decompose the super-net optimization and discretization into multiple stages, each operating on a sub-net, and introduce progressive re-evaluation to enable re-pruning and regrowth of previous units, thereby mitigating potential bias. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet demonstrate that MG-DARTS outperforms other state-of-the-art methods in achieving a better trade-off between model accuracy and parameter efficiency. Code is available at https://github.com/lxy12357/MG_DARTS.
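
As a rough illustration of what "granularity-specific discretization" could look like, the sketch below applies a separate threshold to importance scores at each of the three granularity levels and reports the resulting retention ratios. This is not the authors' implementation: the threshold form, the function names (`keep_mask`, `retention_ratio`), the scores, and the threshold values are all assumptions made for illustration. In the abstract's description the discretization functions are learned along with the architecture; here they are fixed by hand.

```python
# Hypothetical sketch: none of these names or values come from the MG-DARTS code base.

def keep_mask(scores, threshold):
    """Keep a unit when its importance score exceeds the threshold."""
    return [s > threshold for s in scores]


def retention_ratio(mask):
    """Fraction of units that survive pruning."""
    return sum(mask) / len(mask)


# Toy importance scores for one sub-net at three granularity levels.
scores = {
    "operation": [0.9, 0.1, 0.6, 0.3],
    "filter": [0.8, 0.7, 0.2, 0.05, 0.5, 0.4, 0.65, 0.15],
    "weight": [0.02, 0.5, 0.01, 0.3, 0.9, 0.04, 0.6, 0.03, 0.7, 0.05],
}

# One threshold per granularity level (assumed, fixed values for this sketch;
# a learned discretization function would update these during search).
thresholds = {"operation": 0.5, "filter": 0.3, "weight": 0.1}

ratios = {
    level: retention_ratio(keep_mask(scores[level], thresholds[level]))
    for level in scores
}

for level, ratio in ratios.items():
    print(f"{level}: retention ratio {ratio}")
```

Because each level has its own threshold, raising one (say, for weights) lowers that level's retention ratio without touching the others, which is the degree of freedom that fixed, shared pruning rates lack.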

Paper Structure (35 sections, 11 equations, 9 figures, 10 tables, 2 algorithms)

This paper contains 35 sections, 11 equations, 9 figures, 10 tables, 2 algorithms.

Figures (9)

  • Figure 1: Illustration of the accuracy-parameter trade-off with respect to different retention ratios for units at different granularity levels (operation, filter, and weight) on the CIFAR-10 dataset. In (a), we hold the number of operations constant and vary the number of filters and the sparsity ratio, while keeping the model parameters unchanged at approximately 2M. In (b), we hold the sparsity ratio fixed and vary the number of operations and the number of filters, while keeping the model parameters at around 2.3M. We observe that model accuracy varies significantly even when model sizes are similar. This underscores the necessity of effectively balancing the units of different granularities.
  • Figure 2: Illustration of the single-granularity and multi-granularity search spaces. The multi-granularity search space allows for the exploration of fine-grained filter-level and weight-level units, which enables a greater reduction of potentially redundant parameters and facilitates the discovery of more lightweight yet effective models.
  • Figure 3: Overview of pruning with fixed pruning rates and adaptive pruning. The dotted arrows with different colours represent the pruning process at different granularity levels. The width of each dotted arrow indicates the number of pruned units. Existing works prune each granularity independently with fixed pruning rates. In contrast, our adaptive pruning learns granularity-specific discretization functions to adaptively determine the pruning ratio at each granularity level according to the architecture evolution, so that the retention ratios at different levels can be optimized for models with different sizes.
  • Figure 4: Illustration of multi-stage search. We decompose the super-net optimization and discretization into multiple sub-net stages to save memory, and enable further pruning and regrowing of the units in previous sub-nets during subsequent stages to reduce bias.
  • Figure 5: Comparison with different manual ratios. Adaptive pruning effectively identifies configurations that lie within the high-performance region.
  • ...and 4 more figures
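
The Figure 1 caption notes that models of similar size can reach very different accuracy depending on how the parameter budget is split across granularities. A back-of-the-envelope count shows how two quite different configurations can land on the same budget. The formula below is a simplification (square 3x3 convolutions with equal input and output channels, no biases or other layers), and the specific numbers are made up for illustration; they are not the configurations used in the paper.

```python
# Hypothetical sketch: a simplified parameter count, not the paper's accounting.

def conv_params(num_ops, filters, kernel=3, density=1.0):
    """Rough count of non-zero weights in `num_ops` conv layers.

    Assumes each layer maps `filters` input channels to `filters` output
    channels with a kernel x kernel filter, and that `density` is the
    fraction of weights kept after weight-level pruning.
    """
    return num_ops * filters * filters * kernel * kernel * density


# Same number of operations, different filter count and sparsity.
wide_sparse = conv_params(num_ops=14, filters=256, density=0.25)
narrow_dense = conv_params(num_ops=14, filters=128, density=1.0)

print(f"wide and sparse:  {wide_sparse / 1e6:.2f}M parameters")
print(f"narrow and dense: {narrow_dense / 1e6:.2f}M parameters")
```

Both configurations come to roughly 2.06M non-zero weights, so parameter count alone cannot distinguish them; which one is more accurate is an empirical question of the kind the figure addresses.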