Implicit Neural Representation for Physics-driven Actuated Soft Bodies

Lingchen Yang, Byungsoo Kim, Gaspard Zoss, Baran Gözcü, Markus Gross, Barbara Solenthaler

TL;DR

The paper addresses the control of active soft bodies by learning actuation signals as a continuous, implicit function. It introduces an implicit neural representation $\mathcal{N}_{\mathbf{A}}(\mathbf{x})$ that maps material-space points to actuation values and couples it with a differentiable, quasi-static physics solver to optimize deformations. Key contributions are a closed-form Hessian framework that enables efficient backpropagation through the energy-based solver, and a modulated SIREN-based network that supports continuous resolution conditioning; a second network, $\mathcal{N}_{\mathbf{B}}$, models facial bone kinematics. The method demonstrates high-fidelity target matching and smooth pose interpolation across volumetric soft bodies, human poses, and facial expressions, while achieving resolution invariance and lower parameter counts than explicit decoders. The result is a general, artist-friendly approach to pose synthesis and expression animation that is robust to discretization changes.
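To make the idea of backpropagating through an energy-based quasi-static solver more concrete, here is a minimal sketch using the implicit function theorem. It is not the paper's energy or solver: the two-degree-of-freedom energy, the matrices `K` and `B`, and all function names are hypothetical stand-ins chosen so the Hessian has a simple closed form. At equilibrium $\nabla_{\mathbf{u}} E(\mathbf{u}^*, \mathbf{a}) = 0$, so the loss gradient with respect to the actuation needs only one linear solve with the Hessian instead of unrolling the solver iterations.

```python
import numpy as np

# Toy stand-ins (assumptions, not the paper's model): a 2-DOF "soft body".
K = np.array([[2.0, -1.0], [-1.0, 2.0]])  # stiffness-like SPD matrix
B = np.array([[1.0, 0.5], [0.0, 1.0]])    # couples actuation a to the state u

def energy_grad(u, a):
    """Gradient w.r.t. u of E(u, a) = 0.5 u^T K u + 0.25 sum(u^4) - u^T B a."""
    return K @ u + u**3 - B @ a

def energy_hessian(u):
    """Closed-form Hessian of E w.r.t. u (symmetric positive definite here)."""
    return K + np.diag(3.0 * u**2)

def solve_equilibrium(a, iters=30):
    """Quasi-static solve: Newton iterations on grad_u E(u, a) = 0."""
    u = np.zeros(2)
    for _ in range(iters):
        u = u - np.linalg.solve(energy_hessian(u), energy_grad(u, a))
    return u

def loss(a, s):
    """Half squared distance between the simulated state and the target s."""
    u = solve_equilibrium(a)
    return 0.5 * np.sum((u - s) ** 2)

def loss_grad(a, s):
    """dL/da via the implicit function theorem (one adjoint solve)."""
    u = solve_equilibrium(a)
    # Adjoint solve: H lam = dL/du. Since d(grad_u E)/da = -B, dL/da = B^T lam.
    lam = np.linalg.solve(energy_hessian(u), u - s)
    return B.T @ lam

a = np.array([0.3, -0.2])   # actuation parameters (hypothetical values)
s = np.array([0.1, 0.0])    # target state (hypothetical values)
grad = loss_grad(a, s)      # can be checked against finite differences of loss
```

In a full pipeline the actuation `a` would itself be the output of a network, and `grad` would be passed on to that network's parameters by the chain rule.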

Abstract

Active soft bodies can affect their shape through an internal actuation mechanism that induces a deformation. Similar to recent work, this paper utilizes a differentiable, quasi-static, and physics-based simulation layer to optimize for actuation signals parameterized by neural networks. Our key contribution is a general and implicit formulation to control active soft bodies by defining a function that enables a continuous mapping from a spatial point in the material space to the actuation value. This property allows us to capture the signal's dominant frequencies, making the method discretization agnostic and widely applicable. We extend our implicit model to mandible kinematics for the particular case of facial animation and show that we can reliably reproduce facial expressions captured with high-quality capture systems. We apply the method to volumetric soft bodies, human poses, and facial expressions, demonstrating artist-friendly properties, such as simple control over the latent space and resolution invariance at test time.
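The discretization-agnostic property can be illustrated with a small sketch of an implicit actuation field: a function that takes any material-space point and a latent code and returns an actuation value, so the same weights serve any sampling density. Everything below is an assumption for illustration, not the paper's architecture: the layer sizes, the output dimension `out_dim`, the additive latent-dependent phase shift used as "modulation", and the random (untrained) weights.

```python
import numpy as np

W0 = 30.0  # sine frequency scale commonly used with SIREN-style layers

def init_params(rng, latent_dim=4, hidden=32, out_dim=6):
    """Randomly initialised weights; all sizes are illustrative choices."""
    scale = np.sqrt(6.0 / hidden) / W0
    return {
        "W1": rng.uniform(-1.0 / 3.0, 1.0 / 3.0, size=(3, hidden)),
        "b1": np.zeros(hidden),
        "M": rng.normal(scale=0.1, size=(latent_dim, hidden)),  # latent -> phase shift
        "W2": rng.uniform(-scale, scale, size=(hidden, out_dim)),
        "b2": np.zeros(out_dim),
    }

def actuation_field(params, x, z):
    """Evaluate the field at material-space points x (N, 3) for latent code z."""
    shift = z @ params["M"]  # (hidden,), shared by all query points
    h = np.sin(W0 * (x @ params["W1"] + params["b1"]) + shift)
    return h @ params["W2"] + params["b2"]  # (N, out_dim)

rng = np.random.default_rng(0)
params = init_params(rng)
z = rng.normal(size=4)  # latent code, e.g. produced by an encoder

# Query the same field on a coarse and a much finer set of points.
coarse = rng.uniform(-1.0, 1.0, size=(10, 3))
fine = np.vstack([coarse, rng.uniform(-1.0, 1.0, size=(1000, 3))])
a_coarse = actuation_field(params, coarse, z)
a_fine = actuation_field(params, fine, z)
```

Because the field is evaluated point by point, the values at the ten shared points agree between `a_coarse` and `a_fine`; an explicit decoder with one output per mesh element would instead be tied to a fixed discretization.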

Paper Structure (33 sections, 31 equations, 11 figures, 1 table)

Figures (11)

  • Figure 1: Overview of our method. Using a set of observations $\mathbf{s}$ (target poses) we learn actuation signals $\mathcal{A}$ and mandible kinematics $\mathbf{u}_d$ (for faces only), such that when using these parameters in a forward pass the simulation output $\hat{\mathbf{s}}$ matches ground truth. We implicitly represent the two mechanical properties using the networks $\mathcal{N}_{\mathbf{A}}$ and $\mathcal{N}_{\mathbf{B}}$, and couple them with a differentiable quasi-static soft body simulator to allow gradient information to flow from the solver to the networks. The encoder is a global shape descriptor and outputs a latent code $\mathbf{z}$.
  • Figure 2: Training results using the starfish, human body and face datasets. From left to right: target pose, simulation result, reconstruction error, optimized actuation magnitudes.
  • Figure 3: Optimized jaw position (middle) and color-coded magnitude of the movement (right) compared to the initialization (left).
  • Figure 4: Simulation of unseen target poses. From left to right: target pose, simulated result, reconstruction error, optimized actuation magnitudes.
  • Figure 5: Latent space interpolation between two selected expressions.
  • ...and 6 more figures