Forget Superresolution, Sample Adaptively (when Path Tracing)

Martin Bálint, Corentin Salaün, Hans-Peter Seidel, Karol Myszkowski

TL;DR

This work tackles real-time path tracing under ultra-sparse budgets by proposing an end-to-end adaptive sampling and denoising pipeline optimized for sub-1-spp rendering. It introduces a stochastic rounding scheme to backpropagate through discrete sample decisions, a tonemapper-aware training regime with differentiable filmic tonemapping and perceptual MILO loss, and a gather-based pyramidal denoiser with learned per-pixel demodulation. The method demonstrates robust improvements over uniform sampling and state-of-the-art super-resolution and adaptive baselines across multiple budgets and resolutions, preserving perceptually critical details such as shadows and specular highlights. Practically, the approach opens the door to higher-quality, perception-driven rendering in real-time pipelines by exploiting targeted sample placement and perceptual loss guidance, with reasonable inference overhead and strong robustness to extreme sparsity.

Abstract

Real-time path tracing increasingly operates under extremely low sampling budgets, often below one sample per pixel, as rendering complexity, resolution, and frame-rate requirements continue to rise. While super-resolution is widely used in production, it uniformly sacrifices spatial detail and cannot exploit variations in noise, reconstruction difficulty, and perceptual importance across the image. Adaptive sampling offers a compelling alternative, but existing end-to-end approaches rely on approximations that break down in sparse regimes. We introduce an end-to-end adaptive sampling and denoising pipeline explicitly designed for the sub-1-spp regime. Our method uses a stochastic formulation of sample placement that enables gradient estimation despite discrete sampling decisions, allowing stable training of a neural sampler at low sampling budgets. To better align optimization with human perception, we propose a tonemapping-aware training pipeline that integrates differentiable filmic operators and a state-of-the-art perceptual loss, preventing oversampling of regions with low visual impact. In addition, we introduce a gather-based pyramidal denoising filter and a learnable generalization of albedo demodulation tailored to sparse sampling. Our results show consistent improvements over uniform sparse sampling, with notably better reconstruction of perceptually critical details such as specular highlights and shadow boundaries, and demonstrate that adaptive sampling remains effective even at minimal budgets.
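The stochastic discretization of a continuous sample density can be illustrated with plain stochastic rounding. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `stochastic_round` and `discretize` are hypothetical, the density map is assumed to hold the expected number of samples per pixel, and the relaxed gradient estimator used for training is not reproduced here. It only shows the forward pass, in which each pixel receives an integer sample count whose expectation equals the predicted density.

```python
import math
import random


def stochastic_round(density, rng):
    """Round a non-negative expected sample count to an integer.

    The result is floor(density) or floor(density) + 1, chosen so that
    the expected value of the result equals `density`.
    """
    base = math.floor(density)
    frac = density - base  # probability of rounding up
    return base + (1 if rng.random() < frac else 0)


def discretize(density_map, seed=0):
    """Apply stochastic rounding to every pixel of a 2D density map.

    `density_map` is a list of rows of non-negative floats (assumed to be
    the sampler's predicted samples-per-pixel); the return value has the
    same shape and holds integer sample counts.
    """
    rng = random.Random(seed)
    return [[stochastic_round(d, rng) for d in row] for row in density_map]


if __name__ == "__main__":
    # A 2x4 density map averaging 0.25 samples per pixel.
    density = [[0.1, 0.4, 0.1, 0.4], [0.1, 0.4, 0.1, 0.4]]
    for row in discretize(density, seed=42):
        print(row)
```

With a density of 0.25 at every pixel, roughly one pixel in four receives a sample in any given frame, while the average over many frames matches the requested budget.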

Paper Structure

This paper contains 46 sections, 20 equations, 10 figures, and 8 tables.

Figures (10)

  • Figure 1: Overview of our pipeline. Given G-buffer features and temporal data from the previous frame, the sampler network predicts a continuous sample density map. After stochastic discretization (\ref{sec:sampling}), the renderer traces the requested samples, producing a sparse, noisy image. The denoiser network, which shares latent features with the sampler, predicts weights for our pyramidal gather filter (\ref{sec:implementation}). The denoised HDR output is tonemapped for display and fed back to the sampler in the next frame, closing our perceptual feedback loop (\ref{sec:tonemapper}).
  • Figure 2: Visualization of our relaxed estimator. On the left, we illustrate the effect of the sampling probability $p$ on the slope of the function at a given temperature parameter $\lambda$. On the right, we compare the estimator at a given value of $p$ for different values of $\lambda$, along with the Gumbel–Softmax estimator for reference.
  • Figure 3: Gradient comparison. We compare the straight-through estimator and our proposed relaxed estimator to the ground truth gradient estimated via finite differences. Our estimator is closest to the finite-difference reference, leading to more stable optimization due to lower gradient bias.
  • Figure 4: Visualization of our differentiable tone-mapping operator (TMO) for three parameter settings. The parameters $s$ and $h$ control the compression of the lower (toe) and upper (shoulder) luminance tails, respectively. The curves are plotted as a function of the logarithm of radiance.
  • Figure 5: Gathering versus scattering kernels. Gathering kernels (a) weigh the neighbouring sparse samples for every output pixel. Sparsity is distributed within the kernels, ensuring an even distribution of difficulty for the network's prediction task. Scattering kernels (b) scatter the sparse samples to neighbouring output pixels. Sparsity is distributed across the kernels; kernels at sampled pixels are dense and solely responsible for the output image, while the majority of kernels at non-sampled pixels have no effect.
  • ...and 5 more figures
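
The gathering kernels described in Figure 5 can be sketched as a single-level filter in which every output pixel takes a weighted average of the sampled pixels in its neighbourhood. This is a simplified illustration under several assumptions rather than the paper's filter: it is not pyramidal, the names `gather_filter` and `uniform_weights` are hypothetical, the uniform weights stand in for the weights that the denoiser network would predict, and the normalization by the sum of the weights of sampled neighbours is a choice made here for the sketch.

```python
def uniform_weights(height, width, radius):
    """Hypothetical stand-in for network-predicted kernels: all weights 1."""
    size = 2 * radius + 1
    return [[[[1.0] * size for _ in range(size)] for _ in range(width)]
            for _ in range(height)]


def gather_filter(radiance, mask, weights, radius=1):
    """Gather sparse samples into a dense image.

    radiance: 2D list of floats, meaningful only where mask is True.
    mask:     2D list of bools, True where a sample was traced.
    weights:  weights[y][x][j][i] is the kernel weight that output pixel
              (y, x) assigns to the neighbour at offset (j - radius, i - radius).
    Every output pixel owns one kernel, so pixels without a sample of
    their own are still reconstructed from sampled neighbours.
    """
    height, width = len(radiance), len(radiance[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc, norm = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx]:
                        k = weights[y][x][dy + radius][dx + radius]
                        acc += k * radiance[ny][nx]
                        norm += k
            # Pixels with no sampled neighbour in the window stay at 0.
            out[y][x] = acc / norm if norm > 0.0 else 0.0
    return out


if __name__ == "__main__":
    radiance = [[2.0, 0.0, 6.0]]
    mask = [[True, False, True]]
    print(gather_filter(radiance, mask, uniform_weights(1, 3, 1)))
```

A scattering formulation would instead loop over the sampled pixels and splat each one into its neighbours, so only the kernels at sampled pixels would contribute to the output.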