Implicit Neural Representation for Physics-driven Actuated Soft Bodies
Lingchen Yang, Byungsoo Kim, Gaspard Zoss, Baran Gözcü, Markus Gross, Barbara Solenthaler
TL;DR
The paper controls active soft bodies by learning actuation signals as a continuous, implicit function. An implicit neural representation $\mathcal{N}_{\mathbf{A}}(\mathbf{x})$ maps each material-space point $\mathbf{x}$ to an actuation value, and a differentiable, quasi-static physics solver turns those actuations into deformations that are optimized to match targets. Two contributions stand out: a closed-form Hessian that enables efficient backpropagation through the energy-based solver, and a modulated SIREN-based network that supports continuous resolution conditioning and is extended to facial bone kinematics via $\mathcal{N}_{\mathbf{B}}$. The method demonstrates high-fidelity target matching and smooth pose interpolation across volumetric soft bodies, human poses, and facial expressions, while achieving resolution invariance and lower parameter counts than explicit decoders. The result is a general, artist-friendly approach to pose synthesis and expression animation.
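To make the role of the Hessian concrete, the following is a sketch of the standard implicit-differentiation argument for a quasi-static solver. The symbols $E$ (energy), $\mathbf{u}$ (simulated positions), $\mathbf{a}$ (actuation values), $\mathbf{H}$ (Hessian), and $\mathcal{L}$ (target-matching loss) are assumed notation for illustration; the paper's own formulation may differ in detail.

$$
\mathbf{u}^{*}(\mathbf{a}) = \arg\min_{\mathbf{u}} E(\mathbf{u}, \mathbf{a})
\quad\Longrightarrow\quad
\nabla_{\mathbf{u}} E(\mathbf{u}^{*}, \mathbf{a}) = \mathbf{0}
$$

$$
\mathbf{H}\,\frac{\partial \mathbf{u}^{*}}{\partial \mathbf{a}} + \frac{\partial (\nabla_{\mathbf{u}} E)}{\partial \mathbf{a}} = \mathbf{0},
\qquad
\mathbf{H} = \nabla^{2}_{\mathbf{u}\mathbf{u}} E(\mathbf{u}^{*}, \mathbf{a})
$$

$$
\nabla_{\mathbf{a}} \mathcal{L}
= -\left(\frac{\partial (\nabla_{\mathbf{u}} E)}{\partial \mathbf{a}}\right)^{\!\top} \mathbf{H}^{-1}\, \nabla_{\mathbf{u}} \mathcal{L}
$$

Under these assumptions, the gradient reaching the actuation network requires one linear solve with $\mathbf{H}$ per backward pass, which is why having the Hessian in closed form matters for efficiency.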
Abstract
Active soft bodies can affect their shape through an internal actuation mechanism that induces a deformation. Similar to recent work, this paper utilizes a differentiable, quasi-static, and physics-based simulation layer to optimize for actuation signals parameterized by neural networks. Our key contribution is a general and implicit formulation to control active soft bodies by defining a function that enables a continuous mapping from a spatial point in the material space to the actuation value. This property allows us to capture the signal's dominant frequencies, making the method discretization agnostic and widely applicable. We extend our implicit model to mandible kinematics for the particular case of facial animation and show that we can reliably reproduce facial expressions captured with high-quality capture systems. We apply the method to volumetric soft bodies, human poses, and facial expressions, demonstrating artist-friendly properties, such as simple control over the latent space and resolution invariance at test time.
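The continuous mapping from a material-space point to an actuation value can be illustrated with a minimal sketch. This is not the authors' network: the class name `ActuationField`, the layer sizes, the 6-component output, and the way the latent code shifts the first layer are all assumptions chosen for illustration. Only the sine activations with SIREN-style initialization follow the architecture family the summary names.

```python
import numpy as np


class ActuationField:
    """Toy SIREN-style field: material-space point x -> actuation vector.

    All sizes and the modulation scheme are illustrative assumptions.
    """

    def __init__(self, latent_dim=4, hidden=32, out_dim=6, w0=30.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w0 = w0
        # SIREN-style init: first layer uniform in +-1/fan_in (fan_in = 3).
        self.W1 = rng.uniform(-1 / 3, 1 / 3, size=(3, hidden))
        self.b1 = np.zeros(hidden)
        # Later layers: uniform in +-sqrt(6/fan_in)/w0.
        bound = np.sqrt(6.0 / hidden) / w0
        self.W2 = rng.uniform(-bound, bound, size=(hidden, hidden))
        self.b2 = np.zeros(hidden)
        self.W3 = rng.uniform(-bound, bound, size=(hidden, out_dim))
        self.b3 = np.zeros(out_dim)
        # Hypothetical modulation: the latent code adds a phase shift
        # to the first layer.
        self.M = rng.normal(scale=0.1, size=(latent_dim, hidden))

    def __call__(self, x, z):
        """x: (N, 3) material-space points, z: (latent_dim,) pose code."""
        h = np.sin(self.w0 * (x @ self.W1 + self.b1) + z @ self.M)
        h = np.sin(self.w0 * (h @ self.W2 + self.b2))
        return h @ self.W3 + self.b3


def grid_points(n):
    """Regular n x n x n sampling of the unit cube, shape (n**3, 3)."""
    axis = np.linspace(0.0, 1.0, n)
    grid = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack(grid, axis=-1).reshape(-1, 3)


field = ActuationField()
z = np.zeros(4)
coarse = field(grid_points(3), z)  # 27 sample points
fine = field(grid_points(5), z)    # 125 sample points, same weights
```

Because the field is a function of position rather than a per-element table, the same weights serve both samplings, and points shared by the coarse and fine grids receive the same actuation. This is the sense in which such a representation is discretization agnostic.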
