Learnable Fractional Reaction-Diffusion Dynamics for Under-Display ToF Imaging and Beyond

Xin Qiao, Matteo Poggi, Xing Wei, Pengchao Deng, Yanhui Zhou, Stefano Mattoccia

TL;DR

Under-display ToF (UD-ToF) imaging beneath TOLED displays suffers from signal attenuation, multi-path interference (MPI), and temporal noise. The authors combine neural networks with a time-fractional reaction-diffusion model and a continuous convolution operator in LFRD$^2$, enabling learnable fractional orders and memory-aware depth refinement. The approach achieves state-of-the-art results on UD-ToF benchmarks and also improves non-UD tasks such as ToF denoising and depth super-resolution, with ablations validating the contributions of the fractional dynamics and the continuous convolution. This physics-informed, interpretable diffusion framework offers a practical, efficient path to high-quality depth in display-integrated imaging scenarios.

Abstract

Under-display ToF imaging aims to achieve accurate depth sensing through a ToF camera placed beneath a screen panel. However, transparent OLED (TOLED) layers introduce severe degradations, such as signal attenuation, multi-path interference (MPI), and temporal noise, that significantly compromise depth quality. To mitigate these degradations, we propose Learnable Fractional Reaction-Diffusion Dynamics (LFRD$^2$), a hybrid framework that combines the expressive power of neural networks with the interpretability of physical modeling. Specifically, we implement a time-fractional reaction-diffusion module that enables iterative depth refinement with dynamically generated differential orders, capturing long-term dependencies. In addition, we introduce an efficient continuous convolution operator via coefficient prediction and repeated differentiation to further improve restoration quality. Experiments on four benchmark datasets demonstrate the effectiveness of our approach. The code is publicly available at https://github.com/wudiqx106/LFRD2.
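To make the "long-term dependencies" of a time-fractional update concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Grünwald-Letnikov discretization of a Caputo-type derivative, a fixed scalar order `alpha` (the paper generates the orders dynamically), a 1D depth profile instead of a 2D depth map, and a hypothetical reaction term `lam * (u0 - u)` that pulls the estimate back toward the initial depth. All function and parameter names are illustrative.

```python
def gl_weights(alpha, n):
    """Grünwald-Letnikov weights w_k = (-1)^k * binom(alpha, k), k = 0..n."""
    w = [1.0]
    for k in range(1, n + 1):
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w


def laplacian_1d(u):
    """Second difference with replicated borders (zero-flux boundary)."""
    n = len(u)
    return [u[max(i - 1, 0)] + u[min(i + 1, n - 1)] - 2.0 * u[i] for i in range(n)]


def fractional_refine(d0, alpha=0.8, steps=10, h=0.1, diffusivity=1.0, lam=0.5):
    """Explicit fractional reaction-diffusion refinement of a 1D depth profile.

    Each step uses ALL previous states through the memory sum, which is
    what separates a fractional order from the integer-order case.
    """
    u0 = [float(v) for v in d0]
    w = gl_weights(alpha, steps)
    history = [u0]
    for n in range(1, steps + 1):
        prev = history[-1]
        lap = laplacian_1d(prev)
        new = []
        for i in range(len(u0)):
            # Diffusion term plus the assumed (hypothetical) reaction term.
            rhs = diffusivity * lap[i] + lam * (u0[i] - prev[i])
            # Weighted sum over the whole trajectory, relative to u0.
            memory = sum(w[k] * (history[n - k][i] - u0[i]) for k in range(1, n + 1))
            new.append(u0[i] + h ** alpha * rhs - memory)
        history.append(new)
    return history[-1]
```

With `alpha = 1.0` the weights collapse to `[1, -1, 0, 0, ...]`, so the memory sum keeps only the previous state and the update reduces to an ordinary explicit Euler step. For `0 < alpha < 1` every earlier iterate contributes with a slowly decaying weight, which is the memory effect the abstract refers to.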

Paper Structure

This paper contains 13 sections, 9 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Core components of LFRD$^2$. Top: comparison between (a) an Integer Differential Equation (IDE) and (b) a Fractional Differential Equation (FDE). Bottom: comparison between (c) Discrete Convolution and (d) Continuous Convolution, followed by two implementations of the latter: (e) Neural Field Convolution and (f) ours.
  • Figure 2: Overview of LFRD$^2$. Our framework deploys a Deep Initial State Builder, which can be any among the existing networks for UD-ToF imaging, to obtain an initial depth map. Then, the Deep Fractional Reaction-Diffusion module iteratively optimizes it to obtain the final, high-quality (HQ) depth map.
  • Figure 3: Overview of our Continuous Convolution module. Comparison between NFC and our proposal.
  • Figure 4: Qualitative results on the SUD-ToF (top two rows) and RUD-ToF (bottom two rows) datasets. From left to right: (a) IR image and (b) ground-truth depth, followed by (c-g) error maps of SoTA solutions, (h) the error map of LFRD$^2$, and (i) the depth map predicted by LFRD$^2$.
  • Figure 5: Qualitative results at different iterations. IR denotes the IR image, and $E_i$ denotes the error map after the $i$-th iteration.