Zero Grads: Learning Local Surrogate Losses for Non-Differentiable Graphics
Michael Fischer, Tobias Ritschel
TL;DR
This work tackles the challenge of performing gradient-based optimization on non-differentiable, black-box graphics pipelines by learning a local differentiable surrogate. The approach, ZeroGrads, smooths the forward objective with a Gaussian kernel, fits a local neural or polynomial surrogate h(θ, φ), and employs a low-variance, locality-aware estimator to update both the surrogate parameters φ and the decision variables θ online. Key contributions include a fully online, self-supervised surrogate learning framework, an efficient sampling strategy that reduces gradient variance, and demonstrations of scalability to high-dimensional settings (up to 35k interlinked variables) across rendering, procedural modeling, and animation tasks. The method broadens the applicability of gradient-based optimization in graphics, offering a general, scalable toolkit that complements both specialized differentiable renderers and derivative-free optimizers.
Abstract
Gradient-based optimization is now ubiquitous across graphics, but unfortunately cannot be applied to problems with undefined or zero gradients. To circumvent this issue, the loss function can be manually replaced by a "surrogate" that has similar minima but is differentiable. Our proposed framework, ZeroGrads, automates this process by learning a neural approximation of the objective function, which in turn can be used to differentiate through arbitrary black-box graphics pipelines. We train the surrogate on an actively smoothed version of the objective and encourage locality, focusing the surrogate's capacity on what matters at the current training episode. The fitting is performed online, alongside the parameter optimization, and is self-supervised, requiring neither pre-computed data nor pre-trained models. As sampling the objective is expensive (it requires a full rendering or simulator run), we devise an efficient sampling scheme that allows for tractable run-times and competitive performance at little overhead. We demonstrate the optimization of diverse non-convex, non-differentiable black-box problems in graphics, such as visibility in rendering, discrete parameter spaces in procedural modelling, or optimal control in physics-driven animation. In contrast to other derivative-free algorithms, our approach scales well to higher dimensions, which we demonstrate on problems with up to 35k interlinked variables.
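To make the idea of descending on a local surrogate concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes a hypothetical staircase objective (`staircase_loss`) whose gradient is zero almost everywhere, draws Gaussian-perturbed samples around the current parameter, fits a local surrogate to them, and steps along the surrogate's gradient. For brevity the surrogate is a linear model refit in closed form at every step, whereas the abstract describes a neural surrogate trained online; all function names, the step size, the sample count, and the smoothing width are illustrative assumptions.

```python
import math
import random


def staircase_loss(theta: float) -> float:
    """Hypothetical black-box objective: piecewise constant in theta,
    so its gradient is zero almost everywhere. Minimum (0) on (2, 4)."""
    return float(math.floor(abs(theta - 3.0)))


def fit_local_linear(xs, ys):
    """Least-squares fit of the surrogate h(x) = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return a, b


def optimize(theta=-4.0, sigma=1.0, n_samples=16, lr=0.2, steps=100, seed=0):
    """Gradient descent on a locally fitted surrogate (illustrative only)."""
    rng = random.Random(seed)
    for _ in range(steps):
        # Sampling around theta with Gaussian noise implicitly smooths the objective.
        xs = [theta + sigma * rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        ys = [staircase_loss(x) for x in xs]  # black-box evaluations only
        # Fit the local surrogate, then differentiate it instead of the objective.
        _, slope = fit_local_linear(xs, ys)
        theta -= lr * slope
    return theta


theta_final = optimize()
print(theta_final, staircase_loss(theta_final))
```

Plain gradient descent on `staircase_loss` itself would never move, since its derivative is zero wherever it is defined. The slope of the fitted surrogate instead approximates the gradient of the Gaussian-smoothed objective, which is what lets the parameter leave the plateau it starts on. The sample count directly controls how many black-box evaluations each step costs, which is the expense the abstract's sampling scheme is meant to keep tractable.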
