KBNet: Kernel Basis Network for Image Restoration

Yi Zhang; Dasong Li; Xiaoyu Shi; Dailan He; Kangning Song; Xiaogang Wang; Hongwei Qin; Hongsheng Li

TL;DR

KBNet tackles adaptive spatial information aggregation for image restoration by introducing Kernel Basis Attention (KBA), which uses learnable kernel bases and per-pixel fusion to capture diverse local patterns. It couples KBA with a Multi-axis Feature Fusion (MFF) block to jointly encode channel-wise, spatial-invariant, and pixel-adaptive features, all integrated into a U-Net backbone. The approach delivers state-of-the-art results across denoising, deraining, and deblurring benchmarks while reducing computational cost relative to prior SOTA methods. Together, these components provide an efficient framework that blends convolutional inductive biases with adaptive spatial processing for robust low-level vision tasks.

Abstract

How spatial information is aggregated plays an essential role in learning-based image restoration. Most existing CNN-based networks adopt static convolutional kernels to encode spatial information and therefore cannot aggregate it adaptively. Recent transformer-based architectures achieve adaptive spatial aggregation, but they lack the desirable inductive biases of convolutions and incur heavy computational costs. In this paper, we propose a kernel basis attention (KBA) module, which introduces learnable kernel bases to model representative image patterns for spatial information aggregation. Different kernel bases are trained to model different local structures. At each spatial location, they are linearly and adaptively fused by predicted pixel-wise coefficients to obtain aggregation weights. Based on the KBA module, we further design a multi-axis feature fusion (MFF) block to encode and fuse channel-wise, spatial-invariant, and pixel-adaptive features for image restoration. Our model, named kernel basis network (KBNet), achieves state-of-the-art performance on more than ten benchmarks over image denoising, deraining, and deblurring tasks while requiring less computational cost than previous SOTA methods.
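The per-pixel fusion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the depthwise kernel layout, the zero padding, and the hypothetical 1x1 projection `proj` that produces the coefficients are all assumptions made for illustration. The function name `kernel_basis_attention` and its argument shapes are likewise invented here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def kernel_basis_attention(x, bases, coeff):
    """Aggregate each pixel's K x K neighborhood with a per-pixel fused kernel.

    x:     (H, W, C) feature map
    bases: (N, K, K, C) kernel bases (depthwise layout, an assumption)
    coeff: (H, W, N) fusion coefficients, one vector per pixel
    """
    k = bases.shape[1]
    pad = k // 2
    # Zero-pad so the output keeps the input's spatial size (odd K assumed).
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    # Linearly fuse the bases at every location: (H, W, K, K, C).
    fused = np.einsum("hwn,nijc->hwijc", coeff, bases)
    # K x K neighborhood around every pixel: (H, W, C, K, K).
    patches = sliding_window_view(xp, (k, k), axis=(0, 1))
    # Weighted sum over each neighborhood, channel by channel.
    return np.einsum("hwcij,hwijc->hwc", patches, fused)


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
bases = rng.standard_normal((3, 3, 3, 4))  # N=3 bases with 3x3 kernels
proj = rng.standard_normal((4, 3))         # hypothetical 1x1 projection (assumption)
coeff = x @ proj                           # (8, 8, 3) per-pixel coefficients
out = kernel_basis_attention(x, bases, coeff)
print(out.shape)
```

Because the fused kernel is a linear combination of a small set of shared bases, each pixel needs only N coefficients rather than a full kernel, which is the source of the adaptivity at modest cost that the abstract claims.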

Paper Structure

This paper contains 16 sections, 2 equations, 10 figures, 9 tables.

Introduction
Related Work
Traditional Methods
CNNs for Image Restoration
Transformers for Image Restoration
Method
Kernel Basis Attention Module
Multi-axis Feature Fusion Block
Integration of MFF Block into U-Net
Results
Implementation Details
Gaussian Denoising Results
Raw Image Denoising Results
Deraining and Defocus Results
Ablation Studies
...and 1 more section

Figures (10)

Figure 1: An overview of the kernel basis attention (KBA) module. Given the input feature map $X$, the KBA module first predicts the fusion coefficient map $F$ to linearly fuse the learnable kernel bases $W$ at each location. The fused kernel weights $M$ then adaptively encode the local neighborhood of the enhanced feature map $X_e$ to produce the output feature map $X'$.
Figure 2: An overview of the Multi-axis Feature Fusion (MFF) block. Channel attention, depthwise convolution, and our KBA module process the input features in parallel. The outputs of the three operations are fused by point-wise multiplication.
Figure 3: PSNR vs. MACs of different methods on Gaussian denoising of color images. PSNRs are measured on the Urban100 dataset with noise level $\sigma=50$.
Figure 4: Visualization results for Gaussian denoising of color images on the Urban100 dataset huang2015single_urban100. KBNet recovers more fine textures.
Figure 5: Visualization of denoising results on the SenseNoise dataset zhang2021IDR. Our method produces clearer edges and more faithful colors.
...and 5 more figures
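
The three-branch fusion described in the Figure 2 caption can be sketched as follows. This is a simplified illustration rather than the paper's block: the channel attention here is a plain rescaling by pooled statistics, the pixel-adaptive branch is passed in as a callable standing in for the KBA module, and normalization, projections, and residual connections are omitted. All function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def channel_attention(x, w):
    """Channel-wise branch: rescale channels by globally pooled statistics."""
    pooled = x.mean(axis=(0, 1))  # (C,)
    return x * (pooled @ w)       # w is (C, C); result broadcasts over H, W


def depthwise_conv(x, kernel):
    """Spatially invariant branch: one shared K x K filter per channel."""
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    patches = sliding_window_view(xp, (k, k), axis=(0, 1))  # (H, W, C, K, K)
    return np.einsum("hwcij,ijc->hwc", patches, kernel)


def mff_fuse(x, ca_w, dw_kernel, pixel_adaptive):
    """Run the three branches on the same input and multiply their outputs."""
    a = channel_attention(x, ca_w)
    b = depthwise_conv(x, dw_kernel)
    c = pixel_adaptive(x)  # stand-in for the KBA branch (assumption)
    return a * b * c


rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 4))
ca_w = np.eye(4)
dw_kernel = np.zeros((3, 3, 4))
dw_kernel[1, 1, :] = 1.0  # identity filter, used only to keep the demo checkable
y = mff_fuse(x, ca_w, dw_kernel, lambda t: np.ones_like(t))
print(y.shape)
```

Point-wise multiplication lets each branch gate the others, so a feature survives only where the channel-wise, spatially invariant, and pixel-adaptive views agree that it matters.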
