Neural Differential Appearance Equations

Chen Liu, Tobias Ritschel

TL;DR

We address the challenge of reproducing dynamic, time-varying appearance in textures that are spatially stationary but exhibit evolving statistics. The method learns a neural ODE in a latent space to model appearance dynamics, with a warm-up phase that turns random noise into an initial state and a generation phase that evolves this state to match a target exemplar, supporting both RGB textures and relightable SVBRDFs via a differentiable renderer. Key contributions include a two-phase training scheme, an ODE-based appearance model compatible with differentiable rendering, and new RGB and SVBRDF dynamic texture datasets that support relighting experiments. The approach yields realistic, temporally coherent results and, in a user study, is preferred over prior work on exemplars with pronounced temporal appearance changes, with practical implications for games, data generation, and material appearance research.

Abstract

We propose a method to reproduce dynamic appearance textures with space-stationary but time-varying visual statistics. While most previous work decomposes dynamic textures into static appearance and motion, we focus on dynamic appearance that results not from motion but from variations of fundamental properties, such as rusting, decaying, melting, and weathering. To this end, we adopt a neural ordinary differential equation (ODE) to learn the underlying dynamics of appearance from a target exemplar. We simulate the ODE in two phases. In the "warm-up" phase, the ODE diffuses random noise to an initial state. In the generation phase, we then constrain the further evolution of this ODE to replicate the evolution of visual feature statistics in the exemplar. The particular innovation of this work is a neural ODE that achieves both denoising and evolution for dynamics synthesis, together with a proposed temporal training scheme. We study both relightable (BRDF) and non-relightable (RGB) appearance models. For both, we introduce new pilot datasets that allow such phenomena to be studied for the first time: for RGB, we provide 22 dynamic textures acquired from free online sources; for BRDFs, we further acquire a dataset of 21 flash-lit videos of time-varying materials, enabled by a simple-to-construct setup. Our experiments show that our method consistently yields realistic and coherent results, whereas prior work falters under pronounced temporal appearance variations. A user study confirms that our approach is preferred to previous work for such exemplars.
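The two-phase simulation described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the vector field here is a toy linear layer with fixed random weights rather than a trained network, the solver is plain fixed-step Euler, and the time intervals [-1, 0] for warm-up and [0, 1] for generation, the latent size, and all function names (`vector_field`, `euler_integrate`, `to_rgb`, `synthesize`) are assumptions made for illustration.

```python
import math
import random

def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def vector_field(state, t, weights):
    """Toy stand-in for a learned field f(z, t): one linear layer plus Swish."""
    out = []
    for row in weights:
        # Each row holds len(state) state weights plus one weight for time t.
        acc = sum(w * z for w, z in zip(row[:-1], state)) + row[-1] * t
        out.append(swish(acc))
    return out

def euler_integrate(state, t0, t1, steps, weights):
    """Integrate dz/dt = f(z, t) from t0 to t1 with fixed-step explicit Euler."""
    dt = (t1 - t0) / steps
    trajectory = [list(state)]
    t = t0
    for _ in range(steps):
        dz = vector_field(state, t, weights)
        state = [z + dt * d for z, d in zip(state, dz)]
        t += dt
        trajectory.append(list(state))
    return trajectory

def to_rgb(state):
    """Project the first three latent channels to (0, 1) with a sigmoid."""
    return [1.0 / (1.0 + math.exp(-z)) for z in state[:3]]

def synthesize(latent_dim=8, warmup_steps=20, gen_steps=10, seed=0):
    rng = random.Random(seed)
    # Hypothetical fixed weights; in practice these would be trained.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(latent_dim + 1)]
               for _ in range(latent_dim)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    # Warm-up phase (assumed interval [-1, 0]): noise evolves to an initial
    # state without exemplar supervision.
    warmup = euler_integrate(noise, -1.0, 0.0, warmup_steps, weights)
    # Generation phase (assumed interval [0, 1]): these are the states that
    # training would compare against the exemplar's feature statistics.
    generation = euler_integrate(warmup[-1], 0.0, 1.0, gen_steps, weights)
    return [to_rgb(z) for z in generation]

frames = synthesize()
```

Here `frames` holds one RGB triple per generation time step for a single latent vector; a texture model would carry such a state at every pixel and replace the toy field with a convolutional network.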

Paper Structure

This paper contains 59 sections, 9 equations, 12 figures, 3 tables, 1 algorithm.

Figures (12)

  • Figure 1: Overview of our approach. Input is a time series of images shown in the first row. Output is a model that enables sampling new texture instances shown by the pink curve in the bottom row. This is achieved by defining an ODE in a higher-dimensional latent space, shown as the blue curve. We learn an ODE vector field (orange arrows) guiding the update of the latent state. We start from noise, and after a warm-up phase (no exemplar supervision, shown in the first row), the new instance evolves. We project the latent coordinates to RGB space. To make the texture relightable, we involve two projections: one from latent to BRDF parameters (yellow curve), and then a projection that is rendering itself, conditioned on light and view.
  • Figure 2: An overview of our ODE UNet. The dashed lines are skip-concatenations. The Down blocks first halve the image size and then double the number of channels. The Up blocks do the opposite. For activation, we use Swish by default, except for the second-to-last $1 \times 1$ convolution, which employs Sigmoid for boundedness.
  • Figure 3: Results of our method on six exemplars (six blocks). In each block, time goes from left to right; the top row shows a frame of the exemplar, and the row below shows our re-synthesis at that time point. The re-synthesis is performed on a random field three times as high as the original image, to demonstrate that we can produce infinite, diverse, non-repeating samples. The change of leaf color from green to red starts with medium-sized, dark-red islands that grow in size, followed by a global sweep into a darker brown. Our method reproduces these colors and sizes well, with the same global transformation evolving in the form of a spatial "front". The growing sprouts leave little to be desired, given the complexity of the deformation and the specificity of the shapes involved. Copper successfully crystallizes, with crystal shapes and colors matched. The mold textures on the bread are accurate, with dark specks and white hyphae developing. We do well in reproducing the initial combs of melting honey and their partial destruction, even including some fluid-like deformation besides the appearance change to darker beige. The dehydrating radish undergoes large changes, which we capture well, from colors to patterns and even shadows.
  • Figure 4: Comparisons between our method and baselines (three exemplars). Our method successfully reproduces the evolution of target appearances: The melon rind turns from fresh orange to dark brown with black spots appearing; Water droplets on the window freeze into ice filaments and eventually form frost patterns; The dynamics of cracks are quite accurate, starting from the periphery and gradually progressing.
  • Figure 5: Schematic and photo of our acquisition setup. The optic enclosure is omitted in the photo for simplicity.
  • ...and 7 more figures
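
The Down and Up block behavior stated in the Figure 2 caption can be checked with simple shape bookkeeping. The sketch below only tracks (channels, height, width) tuples. The number of levels, the starting shape, and the function names are hypothetical, and how the network reduces channels after each skip-concatenation is not stated in the caption, so that step is left out.

```python
def down_block(shape):
    """Halve the spatial size, then double the channel count."""
    channels, height, width = shape
    return (channels * 2, height // 2, width // 2)

def up_block(shape):
    """Do the opposite: double the spatial size, then halve the channel count."""
    channels, height, width = shape
    return (channels // 2, height * 2, width * 2)

def trace_unet(shape, levels):
    """Return feature shapes through `levels` Down blocks and `levels` Up blocks."""
    encoder = [shape]
    for _ in range(levels):
        encoder.append(down_block(encoder[-1]))
    decoder = [encoder[-1]]
    for level in range(levels):
        upsampled = up_block(decoder[-1])
        skip = encoder[levels - 1 - level]
        # A skip-concatenation requires matching spatial sizes.
        assert upsampled[1:] == skip[1:]
        decoder.append(upsampled)
    return encoder, decoder

# Hypothetical example: 16 channels at 64 x 64 with two levels.
encoder_shapes, decoder_shapes = trace_unet((16, 64, 64), levels=2)
```

With these assumed sizes, the encoder ends at 64 channels on a 16 x 16 grid and the decoder returns to the input shape, so every skip connection pairs features of equal resolution.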