CFDBench: A Large-Scale Benchmark for Machine Learning Methods in Fluid Dynamics
Yining Luo, Yingfa Chen, Zhen Zhang
TL;DR
CFDBench provides a large-scale, parameter-conditioned CFD benchmark to rigorously assess neural operators’ generalization to unseen boundary conditions, fluid properties, and domain geometries. It defines four classic flow problems, generates 302K frames via industry-grade solvers, and interpolates data to 64×64 grids, enabling fair comparisons of non-autoregressive and autoregressive models (FFN, DeepONet, FNO, U-Net, etc.). The study reveals significant generalization gaps, with many baselines exhibiting large errors and notable autoregressive error accumulation, underscoring the need for improved parameter-aware architectures and training strategies. By offering a standardized evaluation framework, CFDBench advances rigorous benchmarking and reproducibility for data-driven CFD solvers with practical implications for fast, generalizable surrogate modeling.
Abstract
In recent years, applying deep learning to solve physics problems has attracted much attention. Data-driven deep learning methods produce fast numerical operators that can learn approximate solutions to the whole system of partial differential equations (i.e., surrogate modeling). Although these neural networks may have lower accuracy than traditional numerical methods, once trained they are orders of magnitude faster at inference. Hence, one crucial feature is that these operators can generalize to unseen PDE parameters without expensive re-training. In this paper, we construct CFDBench, a benchmark tailored for evaluating the generalization ability of neural operators after training in computational fluid dynamics (CFD) problems. It features four classic CFD problems: lid-driven cavity flow, laminar boundary layer flow in circular tubes, dam flows through the steps, and periodic Karman vortex street. The data contains a total of 302K frames of velocity and pressure fields, involving 739 cases with different operating condition parameters, generated with numerical methods. We evaluate the effectiveness of popular neural operators, including feed-forward networks, DeepONet, FNO, and U-Net, on CFDBench by predicting flows with non-periodic boundary conditions, fluid properties, and flow domain shapes that are not seen during training. Appropriate modifications were made to apply popular deep neural networks to CFDBench and to accommodate more changing inputs. Empirical results on CFDBench show that many baseline models have errors as high as 300% in some problems and suffer severe error accumulation when performing autoregressive inference. CFDBench facilitates a more comprehensive comparison of neural operators for CFD than existing benchmarks.
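To make the notion of error accumulation in autoregressive inference concrete, the following is a minimal sketch and not the benchmark's actual evaluation code. It assumes a toy setting in which the true dynamics scale a 64×64 field by 0.90 per step while a hypothetical surrogate scales it by 0.92; the function names `relative_l2_error` and `autoregressive_rollout` are illustrative and do not come from CFDBench. The relative L2 error is one common way to express errors as percentages (a value of 3.0 corresponds to 300%), and whether it matches the metric used in the paper is an assumption.

```python
import numpy as np


def relative_l2_error(pred, target):
    """Relative L2 error ||pred - target|| / ||target||; 3.0 means 300%."""
    return float(np.linalg.norm(pred - target) / np.linalg.norm(target))


def autoregressive_rollout(step_fn, u0, n_steps):
    """Feed each predicted frame back in as the next input."""
    frames = []
    u = u0
    for _ in range(n_steps):
        u = step_fn(u)
        frames.append(u)
    return np.stack(frames)


# Toy, hypothetical dynamics: not a CFD solver and not a trained model.
def true_step(u):
    return 0.90 * u


def model_step(u):
    # Surrogate with a small per-step bias relative to the true dynamics.
    return 0.92 * u


rng = np.random.default_rng(0)
u0 = rng.standard_normal((64, 64))  # one field on a 64x64 grid

truth = autoregressive_rollout(true_step, u0, 20)
pred = autoregressive_rollout(model_step, u0, 20)

# Per-step error grows because each step inherits the previous step's error.
errors = [relative_l2_error(p, t) for p, t in zip(pred, truth)]
for step in (1, 5, 10, 20):
    print(f"step {step:2d}: relative L2 error = {errors[step - 1]:.3f}")
```

In this toy case the per-step bias is only about 2%, yet the error after k steps is (0.92/0.90)^k - 1, so it compounds with rollout length. A non-autoregressive model that maps parameters and time directly to a frame does not feed its own outputs back in, which is why the two inference modes are evaluated separately.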
