Pulsar: Efficient Sphere-based Neural Rendering

Christoph Lassner, Michael Zollhöfer

TL;DR

Pulsar introduces a fast, sphere-based differentiable renderer tightly integrated with PyTorch, enabling end-to-end optimization of geometry and appearance from image observations. By representing scenes with millions of spheres and combining depth-weighted soft blending with a data-parallel CUDA pipeline, it achieves real-time forward and backward passes that scale to large numbers of primitives. The framework supports neural shading and is demonstrated on 3D reconstruction, novel-view synthesis, and view-dependent rendering, delivering substantial speedups over prior differentiable renderers. Its open-source, modular design and strong performance on consumer GPUs make large-scale neural rendering tasks more practical and accessible.

Abstract

We propose Pulsar, an efficient sphere-based differentiable renderer that is orders of magnitude faster than competing techniques, modular, and easy to use due to its tight integration with PyTorch. Differentiable rendering is the foundation for modern neural rendering approaches, since it enables end-to-end training of 3D scene representations from image observations. However, gradient-based optimization of neural mesh, voxel, or function representations suffers from multiple challenges, such as topological inconsistencies, high memory footprints, or slow rendering speeds. To alleviate these problems, Pulsar employs: 1) a sphere-based scene representation, 2) an efficient differentiable rendering engine, and 3) neural shading. Pulsar executes orders of magnitude faster than existing techniques and allows real-time rendering and optimization of representations with millions of spheres. Using spheres for the scene representation yields unprecedented speed while avoiding topology problems. Pulsar is fully differentiable and thus enables a plethora of applications, ranging from 3D reconstruction to general neural rendering.
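
To make the idea of depth-weighted soft blending more concrete, the following is a minimal sketch of blending the spheres that one camera ray intersects. It is a simplified softmax-style blend written for illustration, not Pulsar's exact formulation or API: the function name `soft_blend`, the parameters `gamma` and `eps`, and the convention that larger depth values mean "closer to the camera" are all assumptions introduced here.

```python
import math


def soft_blend(depths, opacities, colors, background, gamma=0.1, eps=1e-3):
    """Blend per-sphere colors along one ray (illustrative sketch only).

    depths:     normalized closeness per sphere in [0, 1]; larger = closer
                to the camera (an assumed convention for this example)
    opacities:  per-sphere opacity in [0, 1]
    colors:     per-sphere channel lists, e.g. [r, g, b]
    background: channel list for the background color
    gamma:      hypothetical softness parameter; small values approach a
                hard z-buffer, large values blend spheres more evenly
    eps:        hypothetical small "closeness" assigned to the background
    """
    # Closer spheres receive larger scores; the background gets a small one.
    scores = [z / gamma for z in depths]
    bg_score = eps / gamma
    # Subtract the maximum score before exponentiating for numerical stability.
    shift = max(scores + [bg_score])
    terms = [o * math.exp(s - shift) for o, s in zip(opacities, scores)]
    bg_term = math.exp(bg_score - shift)
    total = sum(terms) + bg_term
    weights = [t / total for t in terms]
    bg_weight = bg_term / total
    # The pixel is the weighted sum of sphere colors plus the background.
    pixel = [
        sum(w * c[ch] for w, c in zip(weights, colors)) + bg_weight * background[ch]
        for ch in range(len(background))
    ]
    return pixel, weights, bg_weight


# Two spheres on one ray: a red one close to the camera, a blue one behind it.
pixel, weights, bg_weight = soft_blend(
    depths=[0.9, 0.5],
    opacities=[1.0, 1.0],
    colors=[[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    background=[0.0, 0.0, 0.0],
    gamma=1.0,
)
```

Because every weight is a smooth function of depth and opacity, occluded spheres still receive a small, nonzero contribution, which is what lets gradients reach their positions, radii, and colors during optimization. Lowering `gamma` in this sketch concentrates the weight on the closest sphere.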

Paper Structure

This paper contains 36 sections, 3 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Pulsar is an efficient sphere-based differentiable renderer that is orders of magnitude faster than competing techniques, modular, and easy to use. Because it is tightly integrated with PyTorch, it can be employed in a large variety of applications. Using a sphere-based representation, it is possible to optimize not only for color and opacity, but also for positions and radii (a, b, c). Due to the modular design, lighting cues can also be easily integrated (d).
  • Figure 2: Visualization of the neural rendering pipeline. Pulsar enables a particularly fast differentiable projection step that scales to complex scene representations. The scene representation itself can be produced by a neural network. The channel information can be 'latent' and translated to RGB colors in a 'neural shading' step.
  • Figure 3: Scaling behavior for PyTorch3D and Pulsar for different numbers of spheres. Whereas PyTorch3D scales almost linearly in terms of number of spheres, Pulsar employs early-stopping and other optimization techniques to reach much better scaling behavior. Benchmarks performed on an NVIDIA RTX 2070 GPU at $1024\times 1024$ resolution.
  • Figure 4: 3D reconstruction with Pulsar with up to 400k spheres. (a) Silhouette-based deformation reconstruction (cf. [liu2019soft]); 1352 spheres, $64\times 64$. (b) Reconstruction with lighting cues and comparison with DSS [yifan2019differentiable]; 8003 spheres, $256\times 256$. Pulsar finishes the reconstruction after 31s, whereas DSS finishes after 1168s. (c) Reconstruction steps of a 3D head model in 73s; 400k spheres, $800\times 1280$. 80 images with random azimuth and elevation are used. (d) Initialized features for training the neural shading model for (e). (e) Neural rendering results of a pix2pixHD [wang2018pix2pixHD] model based on this geometry.
  • Figure 5: High-resolution scene representation view synthesis and reconstruction examples with 1M or more spheres; scenes from the NeRF dataset [mildenhall2020nerf]. (a) Test view of the 'flower' scene; 2.1M/810K spheres; $1008\times 756$. (b) Test view of the 'fern' scene; 2.6M spheres; $1008\times 756$. (c) Test view of the 'chair' scene with two different virtual viewpoints and a shared per-pixel fully-connected shading model; 5.5M/509K spheres, $1600\times 1600$. Note the viewpoint-dependent shading effects on the chair cover. (d) 360-degree views of the 'chair' model. Sphere counts written as X/Y are before/after optimization.