PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging
Xin Cai, Zhiyuan You, Hailong Zhang, Wentao Liu, Jinwei Gu, Tianfan Xue
TL;DR
PhoCoLens reconstructs photorealistic, measurement-consistent images from lensless cameras. It models the measurement as $\mathbf{y} = \mathbf{A} \mathbf{x} + \mathbf{n}$, where $\mathbf{x}$ is the scene, $\mathbf{A}$ is the lensless forward operator, and $\mathbf{n}$ is noise, and applies the range-null space decomposition $\mathbf{x} = \mathbf{A}^{\dagger} \mathbf{A} \mathbf{x} + (\mathbf{I} - \mathbf{A}^{\dagger} \mathbf{A}) \mathbf{x}$. The first term is the range-space component, which the measurement determines; the second is the null-space component, which the measurement cannot observe. The pipeline has two stages. First, a spatially varying deconvolution (SVDeconv) recovers the range-space (low-frequency) content and enforces data fidelity under a PSF that varies across the field of view. Second, a conditional diffusion model (null-space diffusion) takes that low-frequency content as its condition and generates the missing high-frequency details while preserving consistency with the measurements. Validated on PhlatCam and DiffuserCam, the approach achieves a favorable balance between fidelity and photorealism, outperforming traditional and diffusion-augmented baselines on both full-reference and perceptual metrics. The work advances practical lensless imaging, with potential impact on ultra-compact cameras and real-world imaging systems, although the two-stage processing and diffusion sampling time currently limit real-time use.
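The range-null space decomposition can be checked numerically on a toy problem. The sketch below is illustrative only and is not the authors' implementation: it assumes a small random matrix `A` standing in for the lensless forward operator, and the helper name `range_null_split` is hypothetical. It shows that the null-space component contributes nothing to the measurement, so replacing it leaves `A @ x` unchanged.

```python
import numpy as np

def range_null_split(A, x):
    """Split x into its range-space part A^+ A x and null-space part (I - A^+ A) x."""
    A_pinv = np.linalg.pinv(A)      # Moore-Penrose pseudoinverse A^+
    x_range = A_pinv @ (A @ x)      # component determined by the measurement
    x_null = x - x_range            # component the measurement cannot observe
    return x_range, x_null

# Toy setup (assumption): 4 measurements of an 8-dimensional scene, no noise.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 8))
x = rng.standard_normal(8)

x_range, x_null = range_null_split(A, x)

# Swap in the null-space part of an unrelated vector: the result is a different
# scene that still produces the same measurement.
_, other_null = range_null_split(A, rng.standard_normal(8))
x_alt = x_range + other_null
```

Here `A @ x_null` is zero up to floating-point error, and `A @ x_alt` matches `A @ x`. This is the freedom the second stage exploits: any null-space content can be generated without violating the measurements.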
Abstract
Lensless cameras offer significant advantages in size, weight, and cost compared to traditional lens-based systems. Without a focusing lens, lensless cameras rely on computational algorithms to recover the scenes from multiplexed measurements. However, current algorithms struggle with inaccurate forward imaging models and insufficient priors to reconstruct high-quality images. To overcome these limitations, we introduce a novel two-stage approach for consistent and photorealistic lensless image reconstruction. The first stage of our approach ensures data consistency by focusing on accurately reconstructing the low-frequency content with a spatially varying deconvolution method that adjusts to changes in the Point Spread Function (PSF) across the camera's field of view. The second stage enhances photorealism by incorporating a generative prior from pre-trained diffusion models. By conditioning on the low-frequency content retrieved in the first stage, the diffusion model effectively reconstructs the high-frequency details that are typically lost in the lensless imaging process, while also maintaining image fidelity. Our method achieves a superior balance between data fidelity and visual quality compared to existing methods, as demonstrated with two popular lensless systems, PhlatCam and DiffuserCam. Project website: https://phocolens.github.io/.
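As a rough illustration of what "spatially varying" deconvolution can mean, the sketch below blends several single-PSF Wiener-style deconvolutions using per-pixel weight maps. This is a simplified, non-learned stand-in and not the paper's SVDeconv: the function names, the regularization constant `reg`, the circular-convolution assumption, and the requirement that the weight maps sum to one at each pixel are all assumptions made for the example.

```python
import numpy as np

def wiener_deconv(y, psf, reg=1e-2):
    """Wiener-style deconvolution with one PSF, assuming circular convolution."""
    H = np.fft.fft2(psf, s=y.shape)                 # PSF transfer function
    Y = np.fft.fft2(y)
    X = np.conj(H) * Y / (np.abs(H) ** 2 + reg)     # regularized inverse filter
    return np.real(np.fft.ifft2(X))

def sv_deconv(y, psfs, weights, reg=1e-2):
    """Blend per-PSF deconvolutions with spatial weight maps (assumed to sum to 1)."""
    out = np.zeros_like(y, dtype=float)
    for psf, w in zip(psfs, weights):
        out += w * wiener_deconv(y, psf, reg)
    return out

# Toy check (assumption): two identical delta PSFs, so the blend returns the input.
y = np.arange(16.0).reshape(4, 4)
delta = np.zeros((4, 4))
delta[0, 0] = 1.0
w = np.full((4, 4), 0.3)
restored = sv_deconv(y, [delta, delta], [w, 1.0 - w], reg=0.0)
```

With real data, each PSF would correspond to a different region of the field of view, and the weight maps would favor the PSF that best matches each pixel's location.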
