Toward DNN of LUTs: Learning Efficient Image Restoration with Multiple Look-Up Tables

Jiacheng Li, Chang Chen, Zhen Cheng, Zhiwei Xiong

TL;DR

This paper tackles the efficiency bottleneck of LUT-based image restoration on edge devices by introducing MuLUT, a universal framework in which multiple LUTs cooperate like the layers of a neural network. Complementary, hierarchical, and channel indexing enlarge the receptive field while the total LUT size grows only linearly with indexing capacity, and a LUT-aware finetuning strategy is applied on top. Compared with the single-LUT solution, MuLUT improves PSNR by up to 1.1dB for super-resolution and up to 2.8dB for grayscale denoising, with further evaluation on demosaicing and deblocking, at about 100$\times$ lower energy cost than lightweight deep neural networks. The authors support these claims with experiments and ablations, and release their code and trained models publicly.

Abstract

The widespread usage of high-definition screens on edge devices stimulates a strong demand for efficient image restoration algorithms. Caching deep learning models in a look-up table (LUT) was recently introduced to meet this demand. However, the size of a single LUT grows exponentially with its indexing capacity, which restricts its receptive field and thus its performance. To overcome this intrinsic limitation of the single-LUT solution, we propose a universal method, termed MuLUT, to construct multiple LUTs like a neural network. Firstly, we devise novel complementary indexing patterns, as well as a general implementation for arbitrary patterns, to construct multiple LUTs in parallel. Secondly, we propose a re-indexing mechanism to enable hierarchical indexing between cascaded LUTs. Finally, we introduce channel indexing to allow cross-channel interaction, enabling LUTs to process color channels jointly. In these principled ways, the total size of MuLUT is linear in its indexing capacity, yielding a practical way to obtain superior performance with an enlarged receptive field. We examine the advantage of MuLUT on various image restoration tasks, including super-resolution, demosaicing, denoising, and deblocking. MuLUT achieves a significant improvement over the single-LUT solution, e.g., up to 1.1dB PSNR for super-resolution and up to 2.8dB PSNR for grayscale denoising, while preserving its efficiency: its energy cost is 100$\times$ lower than that of lightweight deep neural networks. Our code and trained models are publicly available at https://github.com/ddlee-cn/MuLUT.
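The exponential-versus-linear size argument in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the defaults (17 sampled levels per indexed pixel, one byte per cached value, $r^2$ output values per entry for $r\times$ super-resolution) are assumptions chosen for the example, not values stated in the text above.

```python
def lut_size_bytes(num_pixels, levels=17, upscale=4, bytes_per_value=1):
    """Storage of a single LUT indexed by `num_pixels` input pixels.

    Assumed layout: `levels` sampled values per indexed pixel, and
    `upscale ** 2` cached output values per entry.
    """
    entries = levels ** num_pixels          # exponential in indexing capacity
    return entries * upscale ** 2 * bytes_per_value


def multi_lut_size_bytes(num_luts, pixels_per_lut=4, **kwargs):
    """Total storage of several equally sized LUTs: linear in their number."""
    return num_luts * lut_size_bytes(pixels_per_lut, **kwargs)


if __name__ == "__main__":
    # One more indexed pixel multiplies a single LUT's size by `levels`.
    for n in (4, 5, 6):
        print(f"single LUT, {n} pixels: {lut_size_bytes(n):,} bytes")
    # One more LUT only adds a constant amount of storage.
    for k in (1, 2, 3):
        print(f"{k} LUT(s) of 4 pixels: {multi_lut_size_bytes(k):,} bytes")
```

Under these assumptions, indexing one additional pixel multiplies the size of a single LUT by 17, whereas adding another 4-pixel LUT only adds a fixed amount of storage.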

Paper Structure (27 sections, 4 equations, 16 figures, 14 tables)

This paper contains 27 sections, 4 equations, 16 figures, 14 tables.

Figures (16)

  • Figure 1: Recap of SR-LUT [DBLP:conf/cvpr/JoK21]. Firstly, a deep super-resolution network is trained. Next, a look-up table (LUT) is obtained by traversing all possible inputs and caching the corresponding output values of the learned network. Finally, predictions are retrieved by locating each query index within its enclosing grid cell and interpolating the pre-computed grid values from the LUT. The indexing entries and corresponding HR values of a 4D LUT for $2\times$ super-resolution are marked in blue and green, respectively. The actual receptive area with the rotation ensemble trick is depicted with dashed lines.
  • Figure 2: (a) For a single LUT, its storage size grows exponentially as the number of covered pixels increases. Our method provides a simple but effective solution to avoid this exponential growth. (b) By enabling cooperation of multiple LUTs, we enlarge the receptive field (RF) from $3 \times 3$ to $9 \times 9$, resulting in a significant performance improvement over SR-LUT while preserving its efficiency. The PSNR values are evaluated on Manga109 for $4\times$ super-resolution.
  • Figure 3: Overview of MuLUT. (a) With complementary, hierarchical, and channel indexing, LUTs can be constructed in a general and flexible way like neural networks. These LUTs are learned with a MuLUTNet, which is composed of multiple MuLUT blocks in a shared structure. After training, the inputs and outputs of each MuLUT block are cached into a LUT, while the computation graph is retained. (b) We design different types of MuLUT blocks to construct the learnable MuLUTNet. The convolution layer is denoted in the format of $\mathtt{kernel\_width} \times \mathtt{kernel\_height}(\mathtt{input\_channel} \rightarrow \mathtt{output\_channel})$. The connecting lines denote dense connections [DBLP:conf/cvpr/HuangLMW17].
  • Figure 4: Complementary indexing of multiple LUTs. With the proposed novel indexing patterns, MuLUT involves more pixels than a single SR-LUT. For example, with MuLUT-S, MuLUT-D, and MuLUT-Y, the $5 \times 5$ area around $I_0$ is fully covered. The pixels covered with the rotation ensemble trick are marked with dashed boxes.
  • Figure 5: The general implementation of obtaining arbitrary patterns.
  • ...and 11 more figures
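
The cache-then-look-up procedure recapped in the Figure 1 caption can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for a trained network, the LUT is indexed by only two pixels (the current pixel and its right neighbor) at 16 levels each, and the interpolation and rotation ensemble steps mentioned in the caption are omitted.

```python
LEVELS = 16  # assumption: 4-bit pixel values, so every index is cached exactly


def toy_model(a, b):
    """Hypothetical stand-in for a trained network with a 1x2 receptive field."""
    return (3 * a + b) // 4


def build_lut(model, levels=LEVELS):
    """Cache the model output for every possible pair of input values."""
    return [[model(a, b) for b in range(levels)] for a in range(levels)]


def apply_lut(lut, image):
    """Replace inference by table look-ups.

    `image` is a list of rows of integers in [0, LEVELS). Each output pixel
    is read from the LUT using the pixel and its right neighbor as indices;
    the last column reuses its own value as the neighbor (edge padding).
    """
    output = []
    for row in image:
        out_row = []
        for x, pixel in enumerate(row):
            neighbor = row[min(x + 1, len(row) - 1)]
            out_row.append(lut[pixel][neighbor])
        output.append(out_row)
    return output


if __name__ == "__main__":
    lut = build_lut(toy_model)
    image = [[0, 15, 7], [3, 3, 8]]
    print(apply_lut(lut, image))
```

Once the table is built, the model is no longer needed at inference time, which is where the efficiency of LUT-based methods comes from. The table has `LEVELS ** 2` entries here, and each additional indexed pixel would multiply that count by `LEVELS`.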