CoCoGaussian: Leveraging Circle of Confusion for Gaussian Splatting from Defocused Images

Jungho Lee, Suhwan Cho, Taeoh Kim, Ho-Deok Jang, Minhyeok Lee, Geonho Cha, Dongyoon Wee, Dogyoon Lee, Sangyoun Lee

TL;DR

CoCoGaussian introduces Circle of Confusion (CoC)-aware Gaussian Splatting to reconstruct 3D scenes from defocused images. It generates CoC Gaussians from base 3D Gaussians and learns an aperture parameter $K$ and a focus plane $d_F$ with a lightweight MLP, so defocus is modeled in a physically grounded way while rendering stays fast on a rasterization-based 3D Gaussian Splatting backbone. An adaptive CoC generation strategy with a learnable scaling factor $\beta$ adds robustness to unreliable depth, which helps reconstruction around reflective or refractive surfaces. On the Deblur-NeRF and DoF-NeRF benchmarks, CoCoGaussian achieves state-of-the-art perceptual and structural metrics, and it allows the depth of field and focus to be controlled at render time.

Abstract

3D Gaussian Splatting (3DGS) has attracted significant attention for its high-quality novel view rendering, inspiring research to address real-world challenges. While conventional methods depend on sharp images for accurate scene reconstruction, real-world captures are often affected by defocus blur due to a finite depth of field, so this blur must be accounted for to obtain a realistic 3D scene representation. In this study, we propose CoCoGaussian, a Circle of Confusion-aware Gaussian Splatting method that enables precise 3D scene representation using only defocused images. CoCoGaussian addresses defocus blur by modeling the Circle of Confusion (CoC) with a physically grounded approach based on the principles of photographic defocus. Exploiting 3D Gaussians, we compute the CoC diameter from depth and learnable aperture information, and generate multiple Gaussians to precisely capture the CoC shape. Furthermore, we introduce a learnable scaling factor to enhance robustness and provide more flexibility in handling unreliable depth in scenes with reflective or refractive surfaces. Experiments on both synthetic and real-world datasets demonstrate that CoCoGaussian achieves state-of-the-art performance across multiple benchmarks.
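
To make the relationship between depth, aperture, and CoC diameter concrete, the sketch below evaluates the textbook thin-lens circle-of-confusion formula. It is not the approximate CoC equation used in the paper, which is not reproduced in this summary; the function name and its parameters (`focal_length`, `aperture_diameter`) are assumptions chosen for illustration, whereas the paper folds aperture information into a learnable parameter $K$.

```python
def thin_lens_coc_diameter(depth, focus_distance, focal_length, aperture_diameter):
    """Textbook thin-lens CoC diameter on the image plane.

    All arguments share one length unit (for example metres).
    This is the standard photographic relation, used here as an
    illustrative stand-in, not the paper's learned formulation.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    if focus_distance <= focal_length:
        raise ValueError("focus_distance must exceed focal_length")
    # Magnification of the in-focus plane onto the sensor.
    magnification = focal_length / (focus_distance - focal_length)
    # Blur grows with the relative distance from the focus plane.
    return aperture_diameter * magnification * abs(depth - focus_distance) / depth


# Hypothetical setup: 50 mm lens at f/2 (25 mm aperture), focused at 2 m.
near = thin_lens_coc_diameter(1.0, 2.0, 0.05, 0.025)
at_focus = thin_lens_coc_diameter(2.0, 2.0, 0.05, 0.025)
far = thin_lens_coc_diameter(4.0, 2.0, 0.05, 0.025)
```

A point on the focus plane yields a diameter of zero, and the diameter scales linearly with the aperture, which matches the qualitative behaviour the abstract attributes to depth and aperture.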

Paper Structure

This paper contains 39 sections, 16 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: We take the camera position $\mathbf{x}_{cam}$ and the base Gaussian parameters $\mu_{B}$, $\mathbf{s}_{B}$, and $\mathbf{q}_{B}$ as inputs to the MLP $h_{\theta}$, which produces five outputs in total. (a) We set the depth $d(\mu_{B})$ as the Euclidean distance between $\mathbf{x}_{cam}$ and $\mu_{B}$. Using the output $K$ from $h_{\theta}$, the depth $d(\mu_{B})$, and the learnable focus plane $d_{F}$, we apply the approximate CoC equation to determine the CoC diameter. The diameter, combined with the outputs $\beta$ and $\mathbf{d}$ from $h_{\theta}$, is then used in the CoC offset equation to compute the offset values for $\mu_{\textrm{CoC}}$. By adding these offsets to $\mu_{B}$, we obtain the $\mathbf{G}_{\textrm{CoC}}$ means. (b) We obtain $\mathbf{s}_{\textrm{CoC}}$ and $\mathbf{q}_{\textrm{CoC}}$ by applying the scaling and quaternion equations to $\delta\mathbf{s}_{\textrm{CoC}}$ and $\delta\mathbf{q}_{\textrm{CoC}}$, the outputs of $h_{\theta}$. (c) Using $\mu_{\textrm{CoC}}$, $\mathbf{s}_{\textrm{CoC}}$, and $\mathbf{q}_{\textrm{CoC}}$, we get $\mathbf{G}_{\textrm{CoC}}$. Finally, we rasterize $\mathbf{G}_{\textrm{CoC}}$ along with $\mathbf{G}_{B}$ to produce $(M+1)$ images, and apply a weighted sum to generate the final defocused image.
  • Figure 2: Qualitative comparison on the Deblur-NeRF and DoF-NeRF datasets.
  • Figure 3: Visualization of aperture parameter and focus plane customization. In the top row, the aperture parameter $K$ decreases from left to right; in the bottom row, the focus plane $d_{F}$ moves farther from the camera from left to right.
  • Figure 4: (a) CoC sizes based on the position of the object relative to the focus plane, and (b) CoC sizes based on the aperture size.
  • Figure 5: Luminance difference between defocused and sharp images. All luminance values are normalized to the range [0, 1].
  • ...and 3 more figures
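
Two steps from the Figure 1 caption are simple enough to sketch directly: the depth of a base Gaussian as the Euclidean distance from the camera, and the weighted sum over the $(M+1)$ rasterized images. The function names, the flat-list image representation, and the normalization of the weights are all assumptions made for illustration; the caption does not say how the weights are obtained, and the MLP $h_{\theta}$ and the rasterizer are not modeled here.

```python
import math


def gaussian_depth(x_cam, mu_b):
    """Depth d(mu_B): Euclidean distance between camera and Gaussian mean."""
    return math.dist(x_cam, mu_b)


def blend_renders(renders, weights):
    """Weighted sum of rendered images, each given as a flat list of pixels.

    Assumption: weights are normalized to sum to one before blending.
    """
    if not renders or len(renders) != len(weights):
        raise ValueError("need one weight per rendered image")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_pixels = len(renders[0])
    blended = [0.0] * n_pixels
    for image, weight in zip(renders, weights):
        if len(image) != n_pixels:
            raise ValueError("all images must have the same size")
        for i, value in enumerate(image):
            blended[i] += (weight / total) * value
    return blended


# Hypothetical example: one base render plus M = 2 CoC renders, two pixels each.
depth = gaussian_depth((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
final = blend_renders([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [2.0, 1.0, 1.0])
```

Normalizing the weights keeps the blended image in the same intensity range as its inputs; a different weighting scheme would only change the `weights` argument.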