Transforming a Non-Differentiable Rasterizer into a Differentiable One with Stochastic Gradient Estimation

Thomas Deliot, Eric Heitz, Laurent Belcour

TL;DR

The paper tackles making rasterization differentiable to enable gradient-based, in-engine optimization of complex 3D assets. It introduces a per-pixel stochastic gradient estimation method that computes two perturbed rasterizations and accumulates each pixel's error difference only into the parameters that contribute to that pixel. Two compute shaders (P, which perturbs the parameters, and G, which accumulates the gradient estimate) and ID/UV buffers track these contributions. The approach is dependency-free, cross-platform, and scales to more than 1 million parameters, with results qualitatively competitive with state-of-the-art differentiable rasterizers such as nvDiffRast while keeping an in-engine workflow. Empirical results validate the per-pixel formulation, demonstrate broad applicability (meshes, textures, volumes, subdivision surfaces, Gaussian splats), and include a detailed performance breakdown of per-step timings. Overall, the method lowers the barrier to differentiable rendering in game engines and similar pipelines by providing an accessible in-engine optimizer for large-scale asset parameters.

Abstract

We show how to transform a non-differentiable rasterizer into a differentiable one with minimal engineering effort and no external dependencies (no PyTorch/TensorFlow). We rely on Stochastic Gradient Estimation, a technique that consists of rasterizing after randomly perturbing the scene's parameters such that their gradient can be stochastically estimated and descended. This method is simple and robust but does not scale in dimensionality (number of scene parameters). Our insight is that the number of parameters contributing to a given rasterized pixel is bounded. Estimating and averaging gradients on a per-pixel basis hence bounds the dimensionality of the underlying optimization problem and makes the method scalable. Furthermore, it is simple to track per-pixel contributing parameters by rasterizing ID- and UV-buffers, which are trivial additions to a rasterization engine if not already available. With these minor modifications, we obtain an in-engine optimizer for 3D assets with millions of geometry and texture parameters.
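
The scaling argument in the abstract can be illustrated with a small NumPy sketch. It is not the paper's shader code: the toy `render` function, the squared-error loss, the ±1 (Rademacher) perturbations, the central difference, and all function names are assumptions made for illustration. Each pixel here depends on exactly one parameter, named by a hypothetical ID buffer `ids`. The full-image estimator credits one scalar error difference to every parameter, while the per-pixel estimator credits each pixel's error difference only to its contributing parameter.

```python
import numpy as np

def render(theta, ids):
    """Toy stand-in for a rasterizer: each pixel shows the parameter named by the ID buffer."""
    return theta[ids]

def full_image_gradient(theta, target, ids, eps, rng):
    """Stochastic finite difference where the whole-image error drives every parameter."""
    s = rng.choice([-1.0, 1.0], size=theta.shape)  # one random sign per parameter
    err_plus = np.sum((render(theta + eps * s, ids) - target) ** 2)
    err_minus = np.sum((render(theta - eps * s, ids) - target) ** 2)
    # Dividing by s equals multiplying by s because s is +1 or -1.
    return (err_plus - err_minus) / (2.0 * eps) * s

def per_pixel_gradient(theta, target, ids, eps, rng):
    """Same perturbation, but errors are kept per pixel and scattered through the ID buffer."""
    s = rng.choice([-1.0, 1.0], size=theta.shape)
    err_plus = (render(theta + eps * s, ids) - target) ** 2   # per-pixel errors
    err_minus = (render(theta - eps * s, ids) - target) ** 2
    grad = np.zeros_like(theta)
    # Each pixel's error difference is accumulated only into its contributing parameter.
    np.add.at(grad, ids, (err_plus - err_minus) / (2.0 * eps) * s[ids])
    return grad

def optimize(theta, target, ids, steps=200, lr=0.1, eps=1e-2, seed=0):
    """Plain gradient descent driven by the per-pixel estimate."""
    rng = np.random.default_rng(seed)
    theta = theta.copy()
    for _ in range(steps):
        theta -= lr * per_pixel_gradient(theta, target, ids, eps, rng)
    return theta
```

In this toy setup the per-pixel estimate matches the analytic gradient up to rounding, because each pixel depends on a single parameter and the error is quadratic. The full-image estimate is only correct in expectation, and its noise grows with the number of parameters. A real rasterizer has several contributing parameters per pixel (vertices, texels), so the per-pixel estimate remains stochastic there, but its dimensionality is bounded per pixel rather than by the size of the scene.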

Paper Structure (28 sections, 9 equations, 7 figures, 3 tables, 2 algorithms)

Figures (7)

  • Figure 1: Overview of our differentiable rasterizer. The first compute shader (P) perturbs the scene parameters before they are rasterized (${\mathcal{R}}$). The second compute shader (G) accumulates the error differences, which provide a gradient estimate. The key point of our approach is that it accumulates the contribution of a pixel (in red in the images) only in its contributing parameters (in red in the vectors).
  • Figure 2: Validation of the per-pixel formulation. In this experiment, we optimize triangle soups to match a 2D image. The full-image variant implements Equation (\ref{eq:stochastic_finite_difference}), where the error over the whole image contributes to every parameter, and the per-pixel approach implements Equation (\ref{eq:stochastic_finite_difference_pixel}). The timings are provided for an NVIDIA 4090 GPU.
  • Figure 3: Qualitative comparison against nvDiffRast: optimizing a mesh with an albedo map. We optimize a mesh with 3072 triangles (1748 vertices) and a $1024^2$ albedo texture. The timings are provided for an NVIDIA 4090 GPU.
  • Figure 4: Qualitative comparison against nvDiffRast: optimizing a mesh with a normal map. We optimize a mesh with 3072 triangles (1748 vertices) and a $512^2$ normal-map texture. The timings are provided for an NVIDIA 4090 GPU.
  • Figure 5: Optimizing a subdivision surface with displacement and normal textures. We optimize a control mesh of 1K vertices that controls the tessellation of a Catmull-Clark subdivision surface. The surface has 24K triangles after two levels of subdivision, which are further displaced and normal mapped with $1024^2$ textures. The timings are provided for an NVIDIA 4090 GPU.
  • ...and 2 more figures