3D Dynamic Fluid Assets from Single-View Videos with Generative Gaussian Splatting

Zhiwei Zhao, Alan Zhao, Minchen Li, Yixin Hu

TL;DR

The paper tackles the problem of producing physically plausible 3D dynamic fluid assets from single-view videos by combining generative 3D Gaussian Splatting geometry with physics-based, differentiable fluid simulation. It introduces a two-stage pipeline: first reconstruct geometry and motion from frames using 3DGS and optical-flow-derived surface velocities with corrections, then optimize simulation parameters within a differentiable APIC framework to reproduce observed dynamics while enforcing divergence-free velocity fields. Key contributions include (i) a preprocessing scheme to densify and stabilize frame-consistent 3DGS for fluids, (ii) a mainstream-guided, physics-constrained velocity estimation method that leverages depth and boundary damping, and (iii) a differentiable grid-based optimization workflow using a learned Poisson solver to align simulated velocities with video data, producing editable, simulation-ready fluid assets. The approach supports various fluid types, provides editable geometry and dynamics, and enables practical production use by leveraging open-source 3DGS methods and GPU-accelerated simulation. This work holds potential for scalable generation of 3D fluid content from widely available single-view videos, with implications for digital content creation and cinematic visual effects.

Abstract

While the generation of 3D content from single-view images has been extensively studied, the creation of physically consistent 3D dynamic scenes from videos remains in its early stages. We propose a novel framework leveraging generative 3D Gaussian Splatting (3DGS) models to extract and re-simulate 3D dynamic fluid objects from single-view videos using simulation methods. The fluid geometry represented by 3DGS is initially generated and optimized from single-view images, then denoised, densified, and aligned across frames. We estimate the fluid surface velocity using optical flow and propose a mainstream extraction algorithm to refine it. The 3D volumetric velocity field is then derived from the velocity of the fluid's enclosed surface. The velocity field is then converted into a divergence-free, grid-based representation, enabling the optimization of simulation parameters through its differentiability across frames. This process outputs simulation-ready fluid assets with physical dynamics closely matching those observed in the source video. Our approach is applicable to various liquid fluids, including inviscid and viscous types, and allows users to edit the output geometry or extend movement durations seamlessly. This automatic method for creating 3D dynamic fluid assets from single-view videos, easily obtainable from the internet, shows great potential for generating large-scale 3D fluid assets at a low cost.
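The abstract's step of converting a velocity field into a divergence-free, grid-based representation can be illustrated with a small sketch. This is not the paper's solver: it assumes a 2D periodic grid and swaps in an FFT-based Helmholtz projection in place of whatever pressure solve the authors use. The function names `wavenumbers`, `project_divergence_free`, and `spectral_divergence` are hypothetical and chosen only for this example.

```python
import numpy as np


def wavenumbers(n):
    """Angular wavenumbers for a periodic axis of n unit-spaced cells."""
    k = 2.0 * np.pi * np.fft.fftfreq(n)
    if n % 2 == 0:
        # Zero the Nyquist mode so first derivatives stay real-valued.
        k[n // 2] = 0.0
    return k


def project_divergence_free(u, v):
    """Remove the compressive part of a periodic 2D velocity field.

    Assumption: periodic boundaries, which is a simplification of a
    real fluid domain with solid walls or a free surface.
    """
    ny, nx = u.shape
    kx = wavenumbers(nx)[None, :]
    ky = wavenumbers(ny)[:, None]
    k2 = kx**2 + ky**2
    k2[k2 == 0.0] = 1.0  # modes with k = 0 carry no divergence; avoid 0/0

    u_hat = np.fft.fft2(u)
    v_hat = np.fft.fft2(v)
    # k . u_hat is the divergence up to a factor of i.
    k_dot_u = kx * u_hat + ky * v_hat
    # Subtract the component of each Fourier mode parallel to k.
    u_hat = u_hat - kx * k_dot_u / k2
    v_hat = v_hat - ky * k_dot_u / k2
    return np.fft.ifft2(u_hat).real, np.fft.ifft2(v_hat).real


def spectral_divergence(u, v):
    """Divergence of (u, v) using the same spectral derivative."""
    ny, nx = u.shape
    kx = wavenumbers(nx)[None, :]
    ky = wavenumbers(ny)[:, None]
    div_hat = 1j * (kx * np.fft.fft2(u) + ky * np.fft.fft2(v))
    return np.fft.ifft2(div_hat).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.standard_normal((32, 32))
    v = rng.standard_normal((32, 32))
    pu, pv = project_divergence_free(u, v)
    print("max |div| before:", np.abs(spectral_divergence(u, v)).max())
    print("max |div| after: ", np.abs(spectral_divergence(pu, pv)).max())
```

The projection is idempotent, so applying it to an already divergence-free field leaves that field unchanged. A production solver for a bounded domain would more likely use a staggered grid and a pressure Poisson solve with explicit boundary conditions.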

Paper Structure

This paper contains 28 sections, 11 equations, 15 figures, 1 table.

Figures (15)

  • Figure 1: Our framework consists of five stages: (1) generating 3DGS representation from input frames and preprocessing the 3D Gaussians, (2) estimating 2D screen-space velocities using optical flow with mainstream correction, (3) combining with depth information to obtain 3D velocities while extracting terrain geometry, (4) optimizing fluid properties through differentiable simulation, and (5) post-processing for final rendering.
  • Figure 2: Preprocessed results for generated 3DGS. The left column is the direct generation, and the right column is processed with the single-view optimization.
  • Figure 3: The filling operation inserts Gaussians into the sparse generated 3DGS in (a) and outputs a dense 3DGS in (b). Our union strategy merges the generated 3DGS from multiple frames and outputs a higher-fidelity geometry in (c).
  • Figure 4: The unguided result shows vanished velocity in regions such as the one highlighted in the close-up (left). After applying mainstream-guided neighboring interpolation, we obtain the result shown in the middle. With further physics-constrained velocity correction, we achieve a more meaningful velocity field (right).
  • Figure 5: Illustration (left) and a real volumetric velocity field (right). The velocity is damped near the riverbed. In the right figure, the darker the color, the smaller the velocity.
  • ...and 10 more figures