SalNAS: Efficient Saliency-prediction Neural Architecture Search with self-knowledge distillation

Chakkrit Termritthikun; Ayaz Umer; Suwichaya Suwanwimolkul; Feng Xia; Ivan Lee

TL;DR

SalNAS is a weight-sharing neural architecture search (NAS) framework for saliency prediction that embeds dynamic convolution into a joint encoder-decoder supernet. It adds Self-KD, a self-knowledge distillation scheme in which the teacher shares the student's architecture and holds the best-performing weights chosen by cross-validation. The student is trained on a weighted average of the ground truth and the teacher's prediction, so no gradients are computed in the teacher. With Self-KD, SalNAS-XL (about 20.98M parameters) outperforms other state-of-the-art models on most evaluation metrics across seven benchmark datasets and reports favorable real-time metrics. The work provides an end-to-end NAS-plus-distillation pipeline for efficient, scalable saliency prediction suited to edge devices; the code is to be released.

Abstract

Recent advances in deep convolutional neural networks have significantly improved the performance of saliency prediction. However, manually configuring neural network architectures requires domain expertise and is time-consuming and error-prone. To address this, we propose a new Neural Architecture Search (NAS) framework for saliency prediction with two contributions. First, we build a supernet for saliency prediction, termed SalNAS: a weight-sharing network that contains all candidate architectures, obtained by integrating dynamic convolution into the encoder-decoder of the supernet. Second, although SalNAS is highly efficient (20.98 million parameters), it can generalize poorly. To address this, we propose a self-knowledge distillation approach, termed Self-KD, that trains the student SalNAS on a weighted average of the ground truth and the prediction of the teacher model. The teacher model shares the student's architecture and contains the best-performing weights chosen by cross-validation. Self-KD generalizes well without computing gradients in the teacher model, which keeps training efficient. With Self-KD, SalNAS outperforms other state-of-the-art saliency prediction models on most evaluation metrics across seven benchmark datasets while remaining a lightweight model. The code will be available at https://github.com/chakkritte/SalNAS
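
The Self-KD idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the mixing weight `alpha`, the mean-squared-error loss, the helper names, and the rule "replace the teacher when the validation score improves" are all assumptions made for illustration. The abstract only states that the target is a weighted average of ground truth and teacher prediction, and that the teacher holds the best-performing weights chosen by cross-validation.

```python
import copy


def blend_targets(ground_truth, teacher_pred, alpha):
    """Weighted average of ground truth and teacher prediction.

    alpha is a hypothetical mixing weight in [0, 1]; alpha = 1 ignores the teacher.
    """
    if len(ground_truth) != len(teacher_pred):
        raise ValueError("ground truth and teacher prediction must match in size")
    return [alpha * g + (1.0 - alpha) * t for g, t in zip(ground_truth, teacher_pred)]


def mse(pred, target):
    """Mean squared error, used here only as a stand-in training loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def update_teacher(teacher_weights, student_weights, val_score, best_score):
    """Keep a frozen copy of the student weights when validation improves.

    Assumes a higher validation score is better. The teacher is only read
    from during training, so no gradient is ever computed through it.
    """
    if val_score > best_score:
        return copy.deepcopy(student_weights), val_score
    return teacher_weights, best_score


# Toy flattened saliency maps with two pixels each (made-up values).
ground_truth = [1.0, 0.0]
teacher_pred = [0.0, 1.0]
target = blend_targets(ground_truth, teacher_pred, alpha=0.75)
student_pred = [0.75, 0.25]
loss = mse(student_pred, target)

# The teacher snapshot is replaced only when the validation score improves.
teacher, best = update_teacher({"w": [0.1]}, {"w": [0.2]}, val_score=0.9, best_score=0.8)
```

Because the teacher is a stored copy of weights rather than a second network being optimized, its prediction can be treated as a constant when the student's loss is differentiated.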

Paper Structure (22 sections, 8 equations, 5 figures, 9 tables, 1 algorithm)

Introduction
Related Work
Compact Saliency Prediction
Neural Architecture Search
Knowledge Distillation (KD)
Proposed Method
SalNAS
Formulation
Architecture space
Self-knowledge distillation
Loss function
Experimental Results
Baseline
Implementation details
LR scheduler and Combination loss
...and 7 more sections

Figures (5)

Figure 1: Our proposed architecture consists of the encoder and decoder modules that employ dynamic convolutional layers.
Figure 2: The proposed Self-Knowledge Distillation method.
Figure 3: The first panel shows the correlation coefficient for the smallest subnet (SalNAS-XS), where the Self-KD method performs best. The second panel shows the correlation coefficient for the largest subnet (SalNAS-XL), where Self-KD outperforms the inplace distillation and sandwich rule methods.
Figure 4: Qualitative comparison between our model (SalNAS-XL) and other models, including TranSalNet, EfficientNet-B4, and TResNet-M. Images are sourced from the SALICON validation dataset. Rows 1-3 show saliency maps produced by the SalNAS-XL subnet that closely match the ground truth. In rows 4-6, SalNAS-XL obtains lower CC scores than the other models.
Figure 5: Exploring the search space of SalNAS: a bubble chart comparing computational complexity (GFLOPS) and correlation coefficient for a sample of 2000 subnets. Bubble size indicates parameter count, and the chart displays various model designs, ranging from SalNAS-XS to SalNAS-XL.
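
The subnets plotted in Figure 5 all draw their weights from one shared supernet. The sketch below shows one common way such weight sharing can work, under the assumption that a narrower subnet layer reuses the leading channels of the widest layer's weight. The source does not spell out this mechanism, and the function names, the width ratios, and the toy 4x4 weight are hypothetical.

```python
import random


def slice_conv_weight(full_weight, out_channels, in_channels):
    """Return the leading out_channels x in_channels block of a shared weight.

    full_weight is a nested list [out][in], standing in for a 1x1 convolution.
    """
    if out_channels > len(full_weight) or in_channels > len(full_weight[0]):
        raise ValueError("a subnet layer cannot be wider than the supernet layer")
    return [row[:in_channels] for row in full_weight[:out_channels]]


def sample_subnet_widths(max_widths, ratios=(0.5, 0.75, 1.0), rng=None):
    """Pick one width per layer by scaling the maximum width with a random ratio."""
    rng = rng or random.Random(0)
    return [max(1, int(w * rng.choice(ratios))) for w in max_widths]


# Hypothetical shared weight of the widest layer: 4 output x 4 input channels.
full = [[float(4 * o + i) for i in range(4)] for o in range(4)]

# A narrower subnet layer reuses the top-left block instead of owning new weights.
small = slice_conv_weight(full, out_channels=2, in_channels=3)

# Sampling many such configurations gives a population of subnets to evaluate.
widths = sample_subnet_widths([8, 16], rng=random.Random(1))
```

Under this scheme every sampled configuration can be evaluated without retraining, which is what makes it cheap to score a large sample of subnets.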
