QuantNAS for super resolution: searching for efficient quantization-friendly architectures against quantization noise

Egor Shvetsov; Dmitry Osin; Alexey Zaytsev; Ivan Koryakovskiy; Valentin Buchnev; Ilya Trofimov; Evgeny Burnaev

TL;DR

QuantNAS targets efficient, high-quality super-resolution (SR) models by combining neural architecture search (NAS) with mixed-precision quantization in a single quantization-aware procedure. It introduces the Adaptive Deviation for Quantization (ADQ) module to alleviate problems caused by Batch Norm in SR blocks, Search Against Noise (SAN) to approximate quantization degradation with differentiable noise, and an entropy-regularized loss that pushes the differentiable NAS toward single-path architectures. The method achieves better PSNR/BitOps Pareto fronts than fixed-quantization baselines and NAS with fixed bits on two SR-inspired search spaces (Basic and RFDN), and its search procedure is 30% faster than direct weight quantization. The results indicate that a carefully designed search space combined with quantization-aware search yields architectures that are both more accurate and more hardware-efficient, which matters for deploying SR on resource-constrained devices.

Abstract

There is a constant need for high-performing and computationally efficient neural network models for image super-resolution (SR): computationally efficient models can run on low-capacity devices and reduce carbon footprints. One way to obtain such models is to compress existing ones, e.g., by quantization. Another way is neural architecture search (NAS), which automatically discovers new, more efficient solutions. We propose QuantNAS, a novel quantization-aware procedure that combines the advantages of these two approaches. To make QuantNAS work, the procedure looks for quantization-friendly super-resolution models. The approach uses entropy regularization, quantization noise, and the Adaptive Deviation for Quantization (ADQ) module to enhance the search procedure. The entropy regularization technique prioritizes a single operation within each block of the search space. Adding quantization noise to parameters and activations approximates model degradation after quantization, resulting in more quantization-friendly architectures. ADQ helps to alleviate problems caused by Batch Norm blocks in super-resolution models. Our experimental results show that the proposed approximations are better for the search procedure than direct model quantization. QuantNAS discovers architectures with a better PSNR/BitOps trade-off than uniform or mixed-precision quantization of fixed architectures. We showcase the effectiveness of our method by applying it to two search spaces inspired by the state-of-the-art SR models and RFDN. Thus, anyone can design a proper search space based on an existing architecture and apply our method to obtain better quality and efficiency. The proposed procedure is 30% faster than direct weight quantization and is more stable.
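The effect of entropy regularization described in the abstract can be illustrated with a small sketch. It assumes, as is common in differentiable NAS, that each block holds a vector of importance logits turned into probabilities by a softmax; the function names and the `weight` coefficient are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_penalty(alphas_per_block, weight=1.0):
    """Sum of per-block entropies of softmax(alpha).

    Assumption: the penalty is added to the training loss, so minimizing it
    pushes each block toward a single dominant operation.
    """
    return weight * sum(entropy(softmax(a)) for a in alphas_per_block)

# One block with four candidate operations.
undecided = [[0.0, 0.0, 0.0, 0.0]]  # all operations equally important
decided = [[8.0, 0.0, 0.0, 0.0]]    # one operation clearly dominates

penalty_undecided = entropy_penalty(undecided)  # equals log(4), the maximum
penalty_decided = entropy_penalty(decided)      # much smaller
```

A block whose importance values are spread evenly pays the largest penalty, so gradient descent on the regularized loss favors near one-hot importance vectors, which is one way to read "prioritizes a single operation within each block".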

Paper Structure (43 sections, 10 equations, 18 figures, 4 tables, 1 algorithm)

Introduction
Related works
Methodology
Search space design
ADQ module
Quantization-Aware Training - QAT
Mixed precision Search and BitMixer
Quantization-Aware Search with Shared Weights (SW)
Quantization-Aware Search Against Noise (SAN)
The search procedure
Hardware constraint regularization
Entropy regularization
Summary
Results
Evaluation protocol
...and 28 more sections

Figures (18)

Figure 1: An example of an overparametrized search space suitable for NAS. An overparametrized supernet is a graph in which multiple candidate operation edges connect nodes, and each node is the output of a layer. The $\alpha$ values represent the edge importance. Jointly training the operation parameters and their importance allows for differentiable NAS. The final architecture is obtained by selecting the edge with the highest importance between each consecutive pair of nodes. The selected edges are marked with solid lines and compose the final neural network architecture.
Figure 2: SAN approach for a single layer. A function $QNoise(b)$ generates quantization noise. $WR$ are real-valued weights, $WQ$ are the output pseudo-quantized weights, and $\boldsymbol{\alpha}$ is a vector of trainable parameters. By adjusting $\boldsymbol{\alpha}$, we search for acceptable model degradation caused by the quantization procedure. $QNoise(b)$ is independent of the weights and allows for propagation of gradients. For quantization-aware search, each blue operation in Figure 1 becomes a SAN operation with noisy weights.
Figure 3: The search space design. We separate the whole architecture into $4$ parts: head, body, upsample, and tail. The head and the tail have $N = 2$ convolutional layers. The identical body part is repeated $K = 3$ times, unless specified otherwise. The number of channels for all the blocks equals $36$, except for the head's first layer, upsample, and the tail's first layers. All the blocks with skip connections incorporate ADQ.
Figure 4: Comparison of ADQ with AdaDM. Some Block represents any residual block with several layers within, $\sigma(X_{in})$ is a variance of the input signal, and $\gamma$ and $\beta$ are learnable scalars. We remove the second BN after $X_{out}$ from the original AdaDM.
Figure 5: Our quantization-aware QuantNAS approach vs. fixed quantized architectures. PSNR is reported for the Set14 dataset and BitOps are computed for an image size of 32x32. We aim at the upper left corner, which corresponds to fewer GBitOps and higher quality as measured by PSNR.
...and 13 more figures
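
The Figure 2 caption describes SAN as perturbing real-valued weights with noise that does not depend on the weights and is mixed through trainable parameters $\boldsymbol{\alpha}$. A minimal sketch of that idea follows. Several details are assumptions made only for illustration and are not stated in the captions: the noise is uniform over one quantization step, the step is derived from a fixed `value_range`, the candidate bit widths are 8 and 4, and the noise terms are weighted by a softmax over $\boldsymbol{\alpha}$. The names `qnoise` and `san_weights` are hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def qnoise(bits, n, value_range, rng):
    """Draw n noise samples for a given bit width.

    Assumption: uniform noise in [-step/2, step/2], where step is the size
    of one quantization bin. The samples do not depend on the weight values.
    """
    step = value_range / (2 ** bits - 1)
    return [rng.uniform(-step / 2, step / 2) for _ in range(n)]

def san_weights(w_real, alpha, bit_widths, rng, value_range=2.0):
    """Return pseudo-quantized weights: w_real plus an alpha-weighted noise mix.

    Assumption: weights live in a fixed range of width value_range,
    e.g. [-1, 1] for the default of 2.0.
    """
    probs = softmax(alpha)
    w_q = list(w_real)
    for p, bits in zip(probs, bit_widths):
        noise = qnoise(bits, len(w_real), value_range, rng)
        w_q = [w + p * e for w, e in zip(w_q, noise)]
    return w_q

rng = random.Random(0)
w_real = [0.5, -0.25, 0.1]
bit_widths = [8, 4]

# Equal importance for both bit widths: noise is dominated by the 4-bit term.
w_mixed = san_weights(w_real, [0.0, 0.0], bit_widths, rng)

# Importance concentrated on 8 bits: the perturbation becomes much smaller.
w_mostly_8bit = san_weights(w_real, [20.0, 0.0], bit_widths, rng)
```

Because the noise is additive and sampled independently of the weights, the output is a linear function of both the weights and the softmax probabilities, so gradients can reach $\boldsymbol{\alpha}$ without a straight-through estimator. Shifting importance toward a higher bit width shrinks the perturbation, which mirrors the trade-off between degradation and bit width that the search explores.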
