RNAS-CL: Robust Neural Architecture Search by Cross-Layer Knowledge Distillation
Utkarsh Nath, Yancheng Wang, Yingzhen Yang
TL;DR
RNAS-CL tackles adversarial robustness in neural architecture search by introducing cross-layer knowledge distillation from robust teachers. It jointly searches the student architecture and per-layer tutor mappings using a differentiable framework based on attention-map alignment and Gumbel-Softmax tutor selection, yielding compact architectures with improved robustness without adversarial training. The approach achieves competitive or superior robustness and clean accuracy on CIFAR-10 and ImageNet-100 across multiple teacher models, and ablations confirm the value of intermediate-layer supervision. The work highlights the practical potential of leveraging robust, cross-layer guidance to obtain efficient, resilient neural networks without the cost of robust training, while also suggesting directions to further amplify robustness through training-time enhancements like TRADES.
Abstract
Deep neural networks are vulnerable to adversarial attacks. Neural Architecture Search (NAS), one of the driving tools behind modern deep neural networks, achieves strong prediction accuracy across a range of machine learning applications. However, it is unclear how NAS-designed architectures perform against adversarial attacks. Given a robust teacher, a natural question is whether NAS can produce a robust architecture by inheriting robustness from that teacher. In this paper, we propose Robust Neural Architecture Search by Cross-Layer Knowledge Distillation (RNAS-CL), a novel NAS algorithm that improves the robustness of NAS by learning from a robust teacher through cross-layer knowledge distillation. Unlike previous knowledge distillation methods, which encourage the student and teacher outputs to be close only at the last layer, RNAS-CL automatically searches for the best teacher layer to supervise each student layer. Experimental results demonstrate the effectiveness of RNAS-CL and show that it produces small and robust neural architectures.
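To make the idea of cross-layer supervision concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that each layer is summarized by a spatial attention map (the channel-wise mean of squared activations, L2-normalized), that every student layer holds one learnable logit per teacher layer, and that a Gumbel-Softmax sample over those logits weights the attention-map distance to each candidate tutor. The function names, the squared-distance loss, and the requirement that all feature maps share one spatial size are illustrative assumptions.

```python
import numpy as np


def attention_map(feat):
    """Flattened, L2-normalized spatial attention map of a (C, H, W) feature tensor.

    Assumption: attention is the channel-wise mean of squared activations.
    """
    a = (feat ** 2).mean(axis=0).ravel()
    return a / (np.linalg.norm(a) + 1e-12)


def gumbel_softmax(logits, tau, rng):
    """Draw a relaxed one-hot sample over teacher layers (soft tutor selection)."""
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    z = (logits + gumbel_noise) / tau
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def cross_layer_loss(student_feats, teacher_feats, tutor_logits, tau=1.0, rng=None):
    """Sum over student layers of the tutor-weighted attention-map distance.

    student_feats: list of (C, H, W) arrays, one per student layer
    teacher_feats: list of (C, H, W) arrays, one per teacher layer
    tutor_logits:  list of (num_teacher_layers,) arrays, one per student layer
    Assumption: all feature maps share the same spatial size H x W.
    """
    rng = np.random.default_rng() if rng is None else rng
    teacher_maps = np.stack([attention_map(f) for f in teacher_feats])
    loss = 0.0
    for s_feat, logits in zip(student_feats, tutor_logits):
        s_map = attention_map(s_feat)
        weights = gumbel_softmax(logits, tau, rng)
        # Squared L2 distance from this student layer to every teacher layer.
        dists = ((teacher_maps - s_map) ** 2).sum(axis=1)
        loss += float(weights @ dists)
    return loss
```

In a differentiable search, a term of this kind would be added to the task loss, with the tutor logits optimized alongside the architecture parameters. Lowering `tau` pushes each sample closer to a hard choice of a single teacher layer per student layer.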
