BokehDiff: Neural Lens Blur with One-Step Diffusion
Chengxuan Zhu, Qingnan Fan, Qi Zhang, Jinwei Chen, Huaqi Zhang, Chao Xu, Boxin Shi
TL;DR
BokehDiff tackles the challenge of rendering photorealistic lens blur when depth priors are imperfect by combining physics-inspired constraints with diffusion priors in a one-step diffusion framework. The method introduces a physics-inspired self-attention (PISA) module that enforces energy conservation, circle-of-confusion limits, and self-occlusion, and it relies on a scalable data synthesis pipeline that uses diffusion-generated foregrounds with transparency to create paired training data. Key contributions include the one-step inference scheme, the PISA module, and the diffusion-based data synthesis approach, which together yield robust performance across depth discontinuities and real-world scenes. The approach is shown to outperform prior bokeh methods on real and synthetic datasets, offering a practical, efficient, and high-fidelity solution for neural lens blur rendering, with potential impact on computational photography and mobile imaging systems.
Abstract
We introduce BokehDiff, a novel lens blur rendering method that achieves physically accurate and visually appealing outcomes with the help of a generative diffusion prior. Previous methods are bounded by the accuracy of depth estimation and generate artifacts at depth discontinuities. Our method employs a physics-inspired self-attention module that aligns with the image formation process, incorporating a depth-dependent circle-of-confusion constraint and self-occlusion effects. We adapt the diffusion model to a one-step inference scheme without introducing additional noise, and achieve results of high quality and fidelity. To address the lack of scalable paired data, we propose using diffusion models to synthesize photorealistic foregrounds with transparency, balancing authenticity and scene diversity.
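To make the circle-of-confusion limit and energy conservation concrete, the following is a minimal 1D toy sketch, not the paper's PISA module. It assumes a blur radius that grows linearly with the disparity offset from the focal plane and a uniform weighting inside that radius. The names `coc_radius` and `scatter_blur_1d` and the parameter `blur_strength` are hypothetical, and self-occlusion is deliberately left out.

```python
def coc_radius(disparity, focus_disparity, blur_strength):
    """Toy circle-of-confusion radius in pixels (assumed linear in defocus)."""
    return blur_strength * abs(disparity - focus_disparity)


def scatter_blur_1d(image, disparity, focus_disparity, blur_strength):
    """Scatter each source pixel uniformly over the pixels inside its CoC."""
    n = len(image)
    out = [0.0] * n
    for j in range(n):
        r = coc_radius(disparity[j], focus_disparity, blur_strength)
        # CoC limit: only targets within radius r of source j receive light.
        # The source itself always qualifies, so the list is never empty.
        targets = [i for i in range(n) if abs(i - j) <= r]
        # Energy conservation: the weights of one source sum to 1.
        weight = 1.0 / len(targets)
        for i in targets:
            out[i] += weight * image[j]
    return out
```

In this sketch an in-focus pixel has radius 0 and is copied unchanged, while a defocused pixel spreads its intensity over its neighbours without changing the total brightness of the row.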
