3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt

Lukas Höllein, Aljaž Božič, Michael Zollhöfer, Matthias Nießner

TL;DR

3DGS-LM replaces ADAM with a tailored Levenberg-Marquardt optimizer to speed up 3D Gaussian Splatting reconstruction. It solves the normal equations in a GPU-accelerated preconditioned conjugate gradient (PCG) inner loop that uses a gradient cache and per-pixel-per-splat parallelization. The method uses a two-stage pipeline (ADAM with densification, followed by LM refinement) and is about 20% faster than the original 3DGS at the same reconstruction quality, while remaining compatible with other 3DGS acceleration techniques. This makes dense scene reconstruction with 3DGS more practical for real-world applications.

Abstract

We present 3DGS-LM, a new method that accelerates the reconstruction of 3D Gaussian Splatting (3DGS) by replacing its ADAM optimizer with a tailored Levenberg-Marquardt (LM). Existing methods reduce the optimization time by decreasing the number of Gaussians or by improving the implementation of the differentiable rasterizer. However, they still rely on the ADAM optimizer to fit Gaussian parameters of a scene in thousands of iterations, which can take up to an hour. To this end, we change the optimizer to LM that runs in conjunction with the 3DGS differentiable rasterizer. For efficient GPU parallelization, we propose a caching data structure for intermediate gradients that allows us to efficiently calculate Jacobian-vector products in custom CUDA kernels. In every LM iteration, we calculate update directions from multiple image subsets using these kernels and combine them in a weighted mean. Overall, our method is 20% faster than the original 3DGS while obtaining the same reconstruction quality. Our optimization is also agnostic to other methods that accelerate 3DGS, thus enabling even larger speedups compared to vanilla 3DGS.
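
To make the optimizer change concrete, here is a minimal NumPy sketch of a Levenberg-Marquardt loop whose inner solve is a Jacobi-preconditioned conjugate gradient (PCG) that only touches the Jacobian through the products $\mathbf{u} = \mathbf{J}\mathbf{p}$ and $\mathbf{g} = \mathbf{J}^T\mathbf{u}$. It is a toy under stated assumptions, not the paper's implementation: the damping form $(\mathbf{J}^T\mathbf{J} + \lambda\,\mathrm{diag}(\mathbf{J}^T\mathbf{J}))\,\Delta = -\mathbf{J}^T\mathbf{r}$, the accept/reject schedule for $\lambda$, the dense Jacobian, the exponential-fit test problem, and the names `pcg` and `lm_fit` are all illustrative choices.

```python
import numpy as np

def pcg(apply_A, b, diag_A, max_iters=50, tol=1e-10):
    """Jacobi-preconditioned CG for A x = b, where A is only available as apply_A(p)."""
    x = np.zeros_like(b)
    r = b.copy()                      # residual b - A x for the initial guess x = 0
    z = r / diag_A                    # apply the diagonal (Jacobi) preconditioner
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for _ in range(max_iters):
        if np.linalg.norm(r) <= tol * b_norm:
            break
        Ap = apply_A(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = r / diag_A
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

def lm_fit(residual_fn, jac_fn, x0, lam=1e-2, iters=30):
    """LM loop; the damping and lambda schedule are assumed, not taken from the paper."""
    x = np.asarray(x0, dtype=float)
    cost = 0.5 * np.sum(residual_fn(x) ** 2)
    for _ in range(iters):
        r = residual_fn(x)
        J = jac_fn(x)
        d = np.maximum(np.einsum("ij,ij->j", J, J), 1e-12)   # diag(J^T J)

        def apply_A(p):
            u = J @ p                 # Jacobian-vector product u = J p
            g = J.T @ u               # transposed product g = J^T u
            return g + lam * d * p    # (J^T J + lam * diag(J^T J)) p

        delta = pcg(apply_A, -(J.T @ r), (1.0 + lam) * d)
        new_cost = 0.5 * np.sum(residual_fn(x + delta) ** 2)
        if new_cost < cost:           # accept the step and trust the model more
            x, cost, lam = x + delta, new_cost, lam * 0.5
        else:                         # reject the step and increase damping
            lam *= 2.0
    return x

# Toy problem (hypothetical): fit y = a * exp(b * t) with true (a, b) = (2, -1).
t = np.linspace(0.0, 2.0, 50)
y = 2.0 * np.exp(-1.0 * t)
residual_fn = lambda x: x[0] * np.exp(x[1] * t) - y
jac_fn = lambda x: np.stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)], axis=1)
x_fit = lm_fit(residual_fn, jac_fn, x0=[1.0, 0.0])
```

Because `pcg` never needs $\mathbf{J}^T\mathbf{J}$ explicitly, the two matrix products inside `apply_A` are the only places a large-scale version would have to change; the abstract describes computing such Jacobian-vector products in custom CUDA kernels backed by a gradient cache.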

Paper Structure (25 sections, 20 equations, 6 figures, 9 tables, 1 algorithm)

This paper contains 25 sections, 20 equations, 6 figures, 9 tables, 1 algorithm.

Figures (6)

  • Figure 1: Our method accelerates 3D Gaussian Splatting (3DGS) [kerbl3Dgaussians] reconstruction by replacing the ADAM optimizer with a tailored Levenberg-Marquardt. Left: starting from the same initialization, our method converges faster on the Tanks&Temples TRAIN scene. Right: after the same amount of time, our method produces higher quality renderings (e.g., better brightness and contrast).
  • Figure 2: Method Overview. We accelerate 3DGS optimization by framing it in two stages. First, we use the original ADAM optimizer and densification scheme to arrive at an initialization for all Gaussians. Second, we employ the Levenberg-Marquardt algorithm to finish optimization.
  • Figure 3: Parallelization Strategy And Caching Scheme. We implement the PCG algorithm with efficient CUDA kernels that use a gradient cache to calculate Jacobian-vector products. Left: before PCG starts, we create the gradient cache following the per-pixel parallelization of 3DGS [kerbl3Dgaussians]. Afterwards, we sort the cache by Gaussians to ensure coalesced read accesses. Right: the cache decouples splats along rays, which allows us to parallelize per-pixel-per-splat when computing $\mathbf{u} = \mathbf{J} \mathbf{p}$ and $\mathbf{g} = \mathbf{J}^T \mathbf{u}$ during PCG.
  • Figure 4: Comparison of initialization iterations. In our first stage, we initialize the Gaussians with gradient descent for $\text{K}$ iterations, before finetuning with our LM optimizer. After $\text{K} {=} 6000$ or $\text{K} {=} 8000$ iterations, our method converges faster than the baseline. With fewer iterations, pure LM is slower, which highlights the importance of our two-stage approach. Results reported on the GARDEN scene from MipNeRF360 [mildenhall2021nerf] without densification.
  • Figure 5: Qualitative comparison of our method and baselines. We compare rendered test images after similar optimization time. All baselines converge faster when using our LM optimizer, which shows in images with fewer artifacts and more accurate brightness / contrast.
  • ...and 1 more figures
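
The caching scheme in the Figure 3 caption can be illustrated with a small NumPy sketch. It is a simplified stand-in, not the paper's CUDA kernels: each splat has a single scalar parameter and each pixel a single residual, the cache is filled with random values instead of rasterizer gradients, and `np.add.at` plays the role that atomic or segmented accumulation would play on the GPU. The function names `build_cache`, `jvp`, and `jtvp` are hypothetical.

```python
import numpy as np

def build_cache(num_pixels, num_splats, splats_per_ray, rng):
    """Fill the cache pixel by pixel, then sort the entries by splat index."""
    pix, spl, grad = [], [], []
    for p in range(num_pixels):       # per-pixel pass, as a rasterizer would traverse rays
        for s in rng.choice(num_splats, size=splats_per_ray, replace=False):
            pix.append(p)
            spl.append(int(s))
            grad.append(rng.normal())  # placeholder for d(residual_p) / d(param_s)
    pix, spl, grad = np.array(pix), np.array(spl), np.array(grad)
    order = np.argsort(spl, kind="stable")  # entries of one splat become contiguous
    return pix[order], spl[order], grad[order]

def jvp(cache, p_vec, num_pixels):
    """u = J p: every (pixel, splat) entry contributes independently to its pixel."""
    pix, spl, grad = cache
    u = np.zeros(num_pixels)
    np.add.at(u, pix, grad * p_vec[spl])
    return u

def jtvp(cache, u_vec, num_splats):
    """g = J^T u: every (pixel, splat) entry contributes independently to its splat."""
    pix, spl, grad = cache
    g = np.zeros(num_splats)
    np.add.at(g, spl, grad * u_vec[pix])
    return g

rng = np.random.default_rng(0)
num_pixels, num_splats = 6, 4
cache = build_cache(num_pixels, num_splats, splats_per_ray=3, rng=rng)
p_vec = rng.normal(size=num_splats)
u = jvp(cache, p_vec, num_pixels)
g = jtvp(cache, u, num_splats)

# Dense reference Jacobian, only feasible at toy scale, used to check both products.
J_dense = np.zeros((num_pixels, num_splats))
J_dense[cache[0], cache[1]] = cache[2]
```

Once the entries are stored, neither product has to walk a ray in order, which is the sense in which the cache decouples splats along rays: each entry is one independent unit of work.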