CIM-NET: A Video Denoising Deep Neural Network Model Optimized for Computing-in-Memory Architectures

Shan Gao, Zhiqiang Wu, Yawen Niu, Xiaotao Li, Qingqing Xu

TL;DR

This work addresses real-time, energy-efficient video denoising on edge devices by proposing CIM-NET, a hardware-algorithm co-design optimized for computing-in-memory (CIM) chips. Central to the approach is CIM-CONV, a pseudo-convolution operator that performs smoothing, upsampling, and downsampling within a single matrix-vector multiplication (MVM) by mapping patches to fully connected transformations, which sharply reduces the number of MVM operations. CIM-NET reconfigures the FastDVDnet framework around CIM-CONV to achieve large receptive fields while maintaining denoising quality: at stride 8, MVM operations drop to 1/77 of the baseline count while PSNR stays close to the baseline (35.11 dB vs 35.56 dB). Ablation studies corroborate the value of CIM-CONV in both the smoothing and reconstruction steps, and robustness is demonstrated across varying SNR conditions. The work highlights the importance of hardware-aware neural operator design for efficient edge video processing on CIM hardware.

Abstract

While deep neural network (DNN)-based video denoising has demonstrated strong performance, deploying state-of-the-art models on edge devices remains challenging due to stringent real-time and energy efficiency requirements. Computing-in-Memory (CIM) chips offer a promising solution by integrating computation within memory cells, enabling rapid matrix-vector multiplication (MVM). However, existing DNN models are often designed without considering CIM architectural constraints, which limits their acceleration potential during inference. To address this, we propose a hardware-algorithm co-design framework incorporating two innovations: (1) a CIM-aware architecture, CIM-NET, optimized for large receptive field operation and CIM's crossbar-based MVM acceleration; and (2) a pseudo-convolutional operator, CIM-CONV, used within CIM-NET to integrate slide-based processing with fully connected transformations for high-quality feature extraction and reconstruction. This framework significantly reduces the number of MVM operations, improving inference speed on CIM chips while maintaining competitive performance. Experimental results indicate that, compared to the conventional lightweight model FastDVDnet, CIM-NET substantially reduces MVM operations with a slight decrease in denoising performance. With a stride value of 8, CIM-NET reduces MVM operations to 1/77th of the original, while maintaining competitive PSNR (35.11 dB vs. 35.56 dB).
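To make the idea of combining slide-based processing with fully connected transformations concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single-channel frame, a square patch, and one MVM per patch position; the helper name `patch_fc`, the 64x64 frame, and the 16-output weight matrix are all hypothetical choices for illustration. The counts it produces are for this toy layer only and are not meant to reproduce the reported 1/77 figure.

```python
import numpy as np

def patch_fc(frame, weight, patch, stride):
    """Slide a patch x patch window over `frame` with the given stride and
    send each flattened patch through one matrix-vector multiply.

    Hypothetical helper for illustration; returns (outputs, mvm_count).
    """
    h, w = frame.shape
    rows = range(0, h - patch + 1, stride)
    cols = range(0, w - patch + 1, stride)
    outputs = np.empty((len(rows), len(cols), weight.shape[0]))
    mvm_count = 0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            vec = frame[r:r + patch, c:c + patch].reshape(-1)
            outputs[i, j] = weight @ vec  # one MVM per patch position
            mvm_count += 1
    return outputs, mvm_count

rng = np.random.default_rng(0)
frame = rng.standard_normal((64, 64))       # toy single-channel frame
weight = rng.standard_normal((16, 8 * 8))   # 64 inputs -> 16 outputs (assumed sizes)

_, dense = patch_fc(frame, weight, patch=8, stride=1)    # window moves 1 pixel at a time
_, strided = patch_fc(frame, weight, patch=8, stride=8)  # non-overlapping patches
print(dense, strided)  # 3249 64
```

In this toy setting, moving from stride 1 to stride 8 cuts the number of MVMs from 57 x 57 to 8 x 8 positions, which illustrates why a larger stride with a fully connected mapping per patch reduces the MVM count on a crossbar.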

Paper Structure

This paper contains 12 sections, 11 figures.

Figures (11)

  • Figure 1: MVM operation schematic of a CIM chip. A layer of a neural network is mapped to the memory sub-arrays. Inputs are loaded in parallel as voltages to activate multiple rows, and column currents are summed according to Kirchhoff’s law.
  • Figure 2: Denoising module of FastDVDnet.
  • Figure 3: Architecture used in Baseline o1.
  • Figure 4: Baseline o1 performance varies with stride value.
  • Figure 5: Architecture used in Baseline o2.
  • ...and 6 more figures
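
The crossbar operation described in the Figure 1 caption can be sketched numerically. This is an idealized model under stated assumptions (no wire resistance, noise, or quantization), with weights stored as cell conductances and inputs applied as row voltages; the function name `crossbar_mvm` and the example values are hypothetical.

```python
import numpy as np

def crossbar_mvm(conductance, voltages):
    """Idealized crossbar: each cell passes current V_i * G[i, j] (Ohm's law),
    and each column sums its cell currents (Kirchhoff's current law)."""
    return voltages @ conductance

# 3 rows (inputs) x 2 columns (outputs), arbitrary units
G = np.array([[1.0, 0.5],
              [2.0, 0.0],
              [0.5, 1.5]])
V = np.array([0.2, 0.1, 0.4])

I = crossbar_mvm(G, V)  # column currents, i.e. one full MVM in a single step
```

All rows are driven at once, so the whole matrix-vector product is obtained in one analog step, which is why the number of MVM operations is the natural cost metric for a model mapped onto CIM hardware.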