Data-Efficient Inference of Neural Fluid Fields via SciML Foundation Model

Yuqiu Liu, Jingxuan Xu, Mauricio Soroco, Yunchao Wei, Wuyang Chen

TL;DR

The paper tackles the challenge of inferring 3D neural fluid fields from real-world video data with limited views. It introduces a SciML foundation model pretrained on multiphysics PDE simulations that can forecast future fluid states and provide meaningful flow representations. Through a collaborative training scheme that uses forecasted views and a feature-aggregation mechanism from the foundation model, the approach achieves superior data efficiency and reconstruction quality on ScalarFlow compared with prior methods. The results demonstrate that SciML foundation models can meaningfully transfer domain knowledge to real-world fluid dynamics, enabling accurate future predictions with far fewer input views and faster convergence.

Abstract

Recent developments in 3D vision have enabled successful progress in inferring neural fluid fields and realistic rendering of fluid dynamics. However, these methods require real-world flow captures, which demand dense video sequences and specialized lab setups, making the process costly and challenging. Scientific machine learning (SciML) foundation models, which are pretrained on extensive simulations of partial differential equations (PDEs), encode rich multiphysics knowledge and thus provide promising sources of domain priors for inferring fluid fields. Nevertheless, their potential to advance real-world vision problems remains largely underexplored, raising questions about the transferability and practical utility of these foundation models. In this work, we demonstrate that a SciML foundation model can significantly improve the data efficiency of inferring real-world 3D fluid dynamics with improved generalization. At the core of our method is leveraging the strong forecasting capabilities and meaningful representations of SciML foundation models. We equip neural fluid fields with a novel collaborative training approach that utilizes augmented views and fluid features extracted by our foundation model. Our method demonstrates significant improvements in both quantitative metrics and visual quality, showcasing the practical applicability of SciML foundation models in real-world fluid dynamics.

Paper Structure

This paper contains 43 sections, 6 equations, 18 figures, 11 tables.

Figures (18)

  • Figure 1: Inferring neural fluid fields requires densely captured views. Meanwhile, PDE simulations are important for building SciML foundation models. How can we utilize this rich domain knowledge to improve 3D fluid reconstruction in the real world?
  • Figure 2: Our method is significantly more data-efficient compared with previous works (PINF \cite{chu2022physics}, HyFluid \cite{yu2024inferring}) on future prediction, over different numbers of initial training frames per input video (x-axis). We show the temporal index of reliably predicted future frames, using a peak signal-to-noise ratio (PSNR) threshold of 25, on the y-axis.
  • Figure 3: Forecasting by SciML foundation models \cite{mccabe2023multiple, hao2024dpot}. Given $T_{in}$ previous steps, the model predicts the next step of the fluid dynamics (here, each frame shows the vorticity of the fluid).
  • Figure 4: Overview: We improve the data efficiency (i.e., reduce the number of input fluid views "nf") of learning neural fluid fields via the SciML foundation model. Given sparse input videos, we utilize our foundation model to: 1) forecast future steps to augment denser views for training (Section \ref{sec:cotraining}); 2) extract flow representations and aggregate into embeddings of fluid density fields (Section \ref{sec:feature_aggregation}).
  • Figure 5: Collaborative training between HyFluid and foundation model via forecasting. "v0, v1, v2" match the corresponding models annotated in Figure \ref{fig:collaborative_training}. HyFluid can be progressively improved (v0$\rightarrow$v1$\rightarrow$v2) with more augmented views. Here we show future predictions over different numbers of initial training frames per input video (x-axis). We show the temporal index of reliably predicted future frames (thresholded by PSNR=25) on the y-axis.
  • ...and 13 more figures
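
The y-axis metric described in the captions of Figures 2 and 5 (the temporal index of reliably predicted future frames under a PSNR threshold of 25) can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names `psnr` and `last_reliable_index` are hypothetical, pixel values are assumed to lie in [0, 1], and "reliably predicted" is assumed to mean the last frame before the PSNR first drops below the threshold; the paper may define it differently.

```python
import math


def psnr(pred, target, max_val=1.0):
    """PSNR in dB between two equally sized flat sequences of pixel values.

    Assumes pixel values are scaled to [0, max_val].
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equally sized")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_val ** 2 / mse)


def last_reliable_index(psnr_per_frame, threshold=25.0):
    """Index of the last frame in the initial run with PSNR >= threshold.

    Assumption: a later frame that recovers above the threshold does not
    count once an earlier frame has fallen below it. Returns -1 if even
    the first frame is below the threshold.
    """
    last = -1
    for i, value in enumerate(psnr_per_frame):
        if value < threshold:
            break
        last = i
    return last
```

For example, given per-frame PSNR values `[30, 27, 25, 24, 26]` for predicted future frames, `last_reliable_index` returns `2`: the frame at index 3 falls below 25, so the later recovery at index 4 is ignored under the stated assumption.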