Robust MAE-Driven NAS: From Mask Reconstruction to Architecture Innovation

Yiming Hu; Xiangxiang Chu; Yong Wang

TL;DR

The paper tackles the data-label bottleneck in neural architecture search by introducing MAE-NAS, an unsupervised NAS framework that uses masked autoencoding in place of supervised objectives within the DARTS search space. It formalizes a bi-level optimization problem in which a masked reconstruction loss guides architecture selection, and it introduces a lightweight hierarchical decoder to prevent the performance collapse that plagues unsupervised DARTS. Empirical results on CIFAR-10 and ImageNet show competitive or superior accuracy at lower search cost compared with both supervised and prior unsupervised NAS methods, and the method is robust to the choice of mask ratio and patch size. This work offers a practical pathway to label-free NAS with strong generalization, potentially extending beyond image classification to other vision tasks.
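As a reading aid, the bi-level problem described above can be sketched in standard DARTS notation. The symbols here are assumptions, not taken from the paper: \alpha denotes the architecture parameters, w the supernet weights, M the set of masked patch indices, and x_i and \hat{x}_i an original patch and its reconstruction. Restricting the loss to masked patches follows the original MAE formulation and may differ from the paper's exact definition.

```latex
\min_{\alpha}\; \mathcal{L}_{\mathrm{rec}}^{\mathrm{val}}\bigl(w^{*}(\alpha), \alpha\bigr)
\quad \text{s.t.} \quad
w^{*}(\alpha) = \arg\min_{w}\; \mathcal{L}_{\mathrm{rec}}^{\mathrm{train}}(w, \alpha),
\qquad
\mathcal{L}_{\mathrm{rec}} = \frac{1}{|M|} \sum_{i \in M} \lVert x_i - \hat{x}_i \rVert_2^2
```

The only change relative to supervised DARTS in this sketch is that both the inner and outer objectives are reconstruction losses, so no labels enter the search.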

Abstract

Neural Architecture Search (NAS) relies heavily on labeled data, which is labor-intensive and time-consuming to obtain. In this paper, we propose a novel NAS method based on an unsupervised paradigm, specifically Masked Autoencoders (MAE), thereby eliminating the need for labeled data. By replacing the supervised learning objective with an image reconstruction task, our approach enables the efficient discovery of network architectures without compromising performance or generalization ability. Additionally, we address the problem of performance collapse encountered in the widely used Differentiable Architecture Search (DARTS) in the unsupervised setting by designing a hierarchical decoder. Extensive experiments across various datasets demonstrate the effectiveness and robustness of our method, offering empirical evidence of its superiority over its counterparts.
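To make the reconstruction objective concrete, the following is a minimal NumPy sketch of random patch masking and a masked mean-squared-error loss. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`patchify`, `random_mask`, `masked_mse`), the 32x32 input, the patch size of 4, and the mask ratio of 0.5 are all hypothetical, and the all-zero prediction merely stands in for the output of an encoder and decoder.

```python
import numpy as np

def patchify(img, patch):
    """Split an (H, W, C) image into (N, patch*patch*C) flattened patches."""
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    x = img.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)

def random_mask(num_patches, mask_ratio, rng):
    """Return a boolean vector; True marks a patch hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask

def masked_mse(target, pred, mask):
    """Mean squared error averaged over masked patches only (assumed, as in MAE)."""
    per_patch = ((target - pred) ** 2).mean(axis=1)
    return float(per_patch[mask].mean())

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))              # stand-in for a CIFAR-sized image
patches = patchify(image, patch=4)           # 64 patches, each of dimension 48
mask = random_mask(len(patches), mask_ratio=0.5, rng=rng)
masked_input = np.where(mask[:, None], 0.0, patches)  # what the encoder would see
prediction = np.zeros_like(patches)          # placeholder for a decoder output
loss = masked_mse(patches, prediction, mask)
```

In a search setting, a loss of this kind would replace the cross-entropy term when updating both the supernet weights and the architecture parameters; the mask ratio and patch size used here are the two knobs the paper's sensitivity analysis examines.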

Paper Structure (12 sections, 3 equations, 2 figures, 3 tables)

Introduction
Method
DARTS Enhanced with Masked Autoencoders
Hierarchical Decoder
Relationship to Prior Works
Experiments
Comparisons with State-of-the-art Methods
Sensitivity Analysis of Mask Ratio and Patch Size
Ablation of Hierarchical Decoder
Visualization of Image Reconstruction
Related Work
Conclusion

Figures (2)

Figure 1: The framework of MAE-NAS. The input is a masked image, which is first fed into an encoder and then passed to a hierarchical decoder, ultimately producing a reconstructed image. The encoder is built on the NAS search space, and the search aims to select an architecture that improves the quality of the reconstructed image.
Figure 2: Comparison of the original images (a) and the reconstructed images on ImageNet. The second and third rows show the images reconstructed by the MAE-NAS supernet with the hierarchical decoder (HD) (b) and without it (c), respectively.
