Towards Multi-Fidelity Scaling Laws of Neural Surrogates in CFD

Paul Setinek; Gianluca Galletti; Johannes Brandstetter

TL;DR

The paper addresses the data-cost challenge in scientific ML by introducing multi-fidelity scaling laws that split the data axis into a compute budget $D_b$ and fidelity composition $D_c$. It evaluates a transformer-based neural surrogate on a CFD Airfoil dataset with paired low- and high-fidelity simulations to reveal a compute-budget scaling law and budget-dependent optimal fidelity mixes. A physical explanation links transfer patterns to boundary-layer treatment, showing positive LF-to-HF transfer for some fields but not for wall shear stress due to near-wall modeling differences. The findings offer practical guidance for compute-efficient dataset generation in scientific ML and outline directions for extending fidelities, adopting continuous fidelities, and generalizing to other domains.

Abstract

Scaling laws describe how model performance grows with data, parameters and compute. While large datasets can usually be collected at relatively low cost in domains such as language or vision, scientific machine learning is often limited by the high expense of generating training data through numerical simulations. However, by adjusting modeling assumptions and approximations, simulation fidelity can be traded for computational cost, an aspect absent in other domains. We investigate this trade-off between data fidelity and cost in neural surrogates using low- and high-fidelity Reynolds-Averaged Navier-Stokes (RANS) simulations. Reformulating classical scaling laws, we decompose the dataset axis into compute budget and dataset composition. Our experiments reveal compute-performance scaling behavior and exhibit budget-dependent optimal fidelity mixes for the given dataset configuration. These findings provide the first study of empirical scaling laws for multi-fidelity neural surrogate datasets and offer practical considerations for compute-efficient dataset generation in scientific machine learning.
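The decomposition of the dataset axis into a compute budget and a composition can be made concrete with a small sketch. It assumes that the composition is expressed as the share of the budget spent on high-fidelity simulations and that each fidelity has a fixed per-sample cost. The function name `split_budget`, the parameter names, and the cost values in the usage lines are hypothetical and not taken from the paper.

```python
def split_budget(budget, hf_share, cost_lf, cost_hf):
    """Turn a compute budget and a fidelity mix into sample counts.

    budget   -- total simulation compute available (any consistent unit)
    hf_share -- fraction of the budget spent on high-fidelity samples
                (an assumed parameterization of the composition)
    cost_lf  -- assumed cost of one low-fidelity simulation
    cost_hf  -- assumed cost of one high-fidelity simulation
    Returns (n_lf, n_hf).
    """
    if not 0.0 <= hf_share <= 1.0:
        raise ValueError("hf_share must lie in [0, 1]")
    if budget < 0 or cost_lf <= 0 or cost_hf <= 0:
        raise ValueError("budget must be >= 0 and costs must be > 0")

    # Whole high-fidelity samples that fit into their share of the budget.
    n_hf = int(budget * hf_share // cost_hf)
    # Everything not actually spent on high fidelity goes to low fidelity,
    # including the remainder left over from rounding n_hf down.
    n_lf = int((budget - n_hf * cost_hf) // cost_lf)
    return n_lf, n_hf


if __name__ == "__main__":
    # Illustrative costs only: one HF run is assumed to cost 10 LF runs.
    for share in (0.0, 0.5, 1.0):
        print(share, split_budget(100, share, cost_lf=1, cost_hf=10))
```

Sweeping `hf_share` at a fixed `budget` gives the set of candidate datasets that cost the same to generate, which is the comparison underlying a budget-dependent optimal fidelity mix.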
