Decoupling Degradations with Recurrent Network for Video Restoration in Under-Display Camera

Chengxu Liu, Xuan Wang, Yuanting Fan, Shuai Li, Xueming Qian

TL;DR

This paper tackles the challenge of restoring videos captured by under-display cameras, where diffraction-induced degradations vary with incident light and over time. It introduces D$^2$RNet, a multi-scale, bi-directional recurrent network augmented with Decoupling Attention Modules (DAM) that separate flare (from strong glare) and haze (from diffusion) and restore them using long- and short-term features, respectively. A soft-mask generator and supervised intermediate outputs enable targeted learning for each degradation path, and the model is extended to three scales to handle scale-changing degradations in long-range videos. The authors also present VidUDC33K, a large-scale UDC video benchmark with HDR content and realistic PSF-based degradation, and demonstrate that D$^2$RNet achieves state-of-the-art performance on this dataset (notably surpassing RVRT by about 1.02 dB PSNR), with robust results on real UDC videos. The accompanying code and dataset facilitate reproducibility and broader adoption in UDC video restoration.
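
To put the reported PSNR margin in perspective, the short sketch below converts a PSNR gain into the equivalent change in mean squared error. It relies only on the standard definition PSNR = 10 log10(MAX^2 / MSE); the function name is hypothetical, and the conversion assumes the same peak value and a gain that holds uniformly rather than as an average over frames.

```python
def mse_ratio_for_psnr_gain(gain_db):
    """Return MSE_new / MSE_old implied by a PSNR gain of `gain_db` decibels.

    Follows from PSNR = 10 * log10(MAX^2 / MSE) with the peak value MAX fixed.
    """
    return 10.0 ** (-gain_db / 10.0)


# The ~1.02 dB margin quoted in the TL;DR
ratio = mse_ratio_for_psnr_gain(1.02)
print(f"MSE ratio: {ratio:.3f}")
```

Under these assumptions, a 1.02 dB gain corresponds to an MSE ratio of roughly 0.79, i.e. about 21% lower mean squared error.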

Abstract

Under-display camera (UDC) systems are the foundation of full-screen display devices in which the lens is mounted under the display. The pixel array of light-emitting diodes used for display diffracts and attenuates incident light, causing various degradations as the light intensity changes. Unlike general video restoration, which recovers video by treating different degradation factors equally, video restoration for UDC systems is more challenging because it must remove diverse degradations over time while preserving temporal consistency. In this paper, we introduce a novel video restoration network, called D$^2$RNet, specifically designed for UDC systems. It employs a set of Decoupling Attention Modules (DAM) that effectively separate the various video degradation factors. More specifically, a soft mask generation function is proposed to formulate each frame into flare and haze based on the diffraction arising from incident light of different intensities, followed by the proposed flare and haze removal components that leverage long- and short-term feature learning to handle the respective degradations. Such a design offers a targeted and effective solution to eliminating various types of degradation in UDC systems. We further extend our design to multiple scales to handle the scale changes in degradation that often occur in long-range videos. To demonstrate the superiority of D$^2$RNet, we propose a large-scale UDC video benchmark by gathering HDR videos and generating realistically degraded videos using the point spread function measured by a commercial UDC system. Extensive quantitative and qualitative evaluations demonstrate the superiority of D$^2$RNet compared to other state-of-the-art video restoration and UDC image restoration methods. Code is available at https://github.com/ChengxuLiu/DDRNet.git
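
The idea of decoupling a frame into flare and haze according to light intensity can be illustrated with a minimal sketch. This is not the paper's soft mask generation function, whose exact form is not given here: the sigmoid shape, the `threshold` and `sharpness` parameters, and the function name are all assumptions chosen only to show how a soft, intensity-driven split might look.

```python
import math


def soft_masks(luma, threshold=0.8, sharpness=20.0):
    """Split a frame into complementary soft flare/haze masks.

    `luma` is a 2D list of per-pixel intensities in [0, 1].
    Hypothetical stand-in for the paper's mask function: the sigmoid form,
    `threshold`, and `sharpness` are illustrative assumptions.
    """
    # Bright pixels get flare weights near 1, dark pixels near 0.
    flare = [[1.0 / (1.0 + math.exp(-sharpness * (v - threshold))) for v in row]
             for row in luma]
    # The haze mask is the complement, so the two masks sum to 1 per pixel.
    haze = [[1.0 - m for m in row] for row in flare]
    return flare, haze


frame = [[0.05, 0.30],
         [0.80, 1.00]]
flare, haze = soft_masks(frame)
```

Because the masks are soft rather than binary, pixels near the threshold contribute to both branches, which avoids a hard seam between the regions treated as flare and those treated as haze.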

Paper Structure (24 sections, 10 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Method illustration. In UDC systems, the degree of degradation is positively correlated with the intensity of incident light. Our method decouples the degradation into brighter flare and darker haze, which are handled using information from long and short distances, respectively.
  • Figure 2: (a) illustrates the formation of the PSF in UDC systems. The light emitted from the light source crosses a display and a lens before it is finally captured by the sensor. (b) illustrates the generation of UDC video, where the matching part computes the homography matrix (i.e., $H$) corresponding to inter-frame motion, and the transform part performs a perspective warp on the PSF.
  • Figure 3: Overview of D$^2$RNet. It adopts a multi-scale bilateral recurrent architecture, in which an encoder and a decoder extract frame features and reconstruct the output frame, respectively. The proposed decoupling attention modules (DAM, see details in Fig. 4) refine the features in both backward and forward propagation, with supervision applied at multiple scales.
  • Figure 4: Structure of the Decoupling Attention Module (DAM). From top to bottom, it mainly consists of a soft mask generation function $\varphi(\cdot)$ for decoupling the flare $M^{flare}_t$ and haze $M^{haze}_t$, and flare and haze removal components that handle the respective degradations using long- and short-term features, respectively.
  • Figure 5: Visual results on the proposed VidUDC33K. The method is indicated at the bottom. Zoom in for a better view.
  • ...and 2 more figures
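
The perspective warp mentioned in the Figure 2(b) caption can be sketched as the standard mapping of a point through a 3x3 homography. The snippet below is a generic illustration in plain Python, not the authors' data-generation code; the function name and the example matrices are hypothetical.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return xh / w, yh / w


# Illustrative matrices (assumed values): a pure shift and a mild perspective tilt.
H_shift = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.0, 0.0, 1.0]]
H_persp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.001, 0.0, 1.0]]

shifted = apply_homography(H_shift, 3.0, 4.0)
tilted = apply_homography(H_persp, 100.0, 50.0)
```

Warping an entire PSF amounts to applying this mapping (or its inverse, with interpolation) to every pixel coordinate of the kernel, which image libraries typically provide as a single perspective-warp call.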