VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis
Angtian Wang, Peng Wang, Jian Sun, Adam Kortylewski, Alan Yuille
TL;DR
VoGE is a differentiable renderer for explicit geometric representations. It uses Gaussian ellipsoids as 3D primitives and renders them by ray tracing, blending primitives according to their volume densities along each ray. For efficiency, it introduces an approximate closed-form solution for volume density aggregation and a coarse-to-fine rendering strategy; a CUDA implementation reaches real-time rendering. Key contributions include the Gaussian-ellipsoid reconstruction kernel as a geometric primitive, a differentiable rendering pipeline that naturally handles occlusions, and integration as a neural network module for sampling and rendering. Empirically, VoGE outperforms state-of-the-art differentiable renderers on in-the-wild object pose estimation and on shape/texture fitting, while keeping rendering speed competitive and providing improved gradient signals for occlusion reasoning and inverse rendering.
Abstract
Gaussian reconstruction kernels were proposed by Westover (1990) and studied by the computer graphics community in the 1990s; they offer an alternative to meshes and point clouds for representing 3D object geometry. In contrast, current state-of-the-art (SoTA) differentiable renderers (Liu et al., 2019) use rasterization to collect triangles or points at each image pixel and blend them based on viewing distance. In this paper, we propose VoGE, which uses volumetric Gaussian reconstruction kernels as geometric primitives. The VoGE rendering pipeline uses ray tracing to capture the nearest primitives and blends them as mixtures based on their volume density distributions along the rays. To render efficiently with VoGE, we propose an approximate closed-form solution for the volume density aggregation and a coarse-to-fine rendering strategy. Finally, we provide a CUDA implementation of VoGE, which enables real-time rendering at a speed competitive with PyTorch3D. Quantitative and qualitative experimental results show that VoGE outperforms SoTA counterparts on various vision tasks, e.g., object pose estimation, shape/texture fitting, and occlusion reasoning. The VoGE library and demos are available at: https://github.com/Angtian/VoGE.
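To illustrate why a Gaussian ellipsoid admits a closed-form treatment along a ray, the sketch below integrates the unnormalized density exp(-0.5 (x - mean)^T cov^{-1} (x - mean)) of a single ellipsoid along a ray x(t) = origin + t * direction. Restricted to the ray, the density is a 1D Gaussian in t, so both its peak location and its integral over the whole line have exact expressions. This is only a minimal NumPy sketch of that standard identity: the function name `gaussian_ray_integral` and its parameters are hypothetical, it is not the VoGE API, and the paper's approximate aggregation across several ellipsoids with occlusion may differ from this single-primitive case.

```python
import numpy as np


def gaussian_ray_integral(origin, direction, mean, cov):
    """Integrate an unnormalized 3D Gaussian density along an infinite ray.

    Hypothetical helper for illustration only (not part of the VoGE library).
    Density: rho(x) = exp(-0.5 * (x - mean)^T cov^{-1} (x - mean)).
    Ray:     x(t) = origin + t * direction, for t in (-inf, +inf).
    Returns (t_peak, peak_density, integral).
    """
    precision = np.linalg.inv(cov)  # inverse covariance of the ellipsoid
    v = origin - mean

    # The exponent is a quadratic in t: a * t^2 + 2 * b * t + c.
    a = direction @ precision @ direction
    b = direction @ precision @ v
    c = v @ precision @ v

    # Completing the square gives the ray parameter of maximum density ...
    t_peak = -b / a
    # ... the density at that point ...
    peak_density = np.exp(-0.5 * (c - b * b / a))
    # ... and the Gaussian integral of the remaining 1D bell curve in t.
    integral = peak_density * np.sqrt(2.0 * np.pi / a)
    return t_peak, peak_density, integral


# Example: an anisotropic ellipsoid at the world origin, viewed along +z.
cov = np.diag([0.04, 0.09, 0.25])
origin = np.array([0.1, -0.05, -3.0])
direction = np.array([0.02, 0.01, 1.0])
direction = direction / np.linalg.norm(direction)

t_peak, peak_density, integral = gaussian_ray_integral(
    origin, direction, np.zeros(3), cov
)
print(t_peak, peak_density, integral)
```

Because every step is a smooth function of the ray and the ellipsoid parameters, the same expressions can be written in an autodiff framework to obtain gradients with respect to the mean, covariance, or camera ray.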
