DrivAerML: High-Fidelity Computational Fluid Dynamics Dataset for Road-Car External Aerodynamics
Neil Ashton, Charles Mockett, Marian Fuchs, Louis Fliessbach, Hendrik Hetmann, Thilo Knacke, Norbert Schonwald, Vangelis Skaperdas, Grigoris Fotiadis, Astrid Walle, Burkhard Hupertz, Danielle Maddix
TL;DR
The DrivAerML work tackles the lack of open-source, high-fidelity CFD data for realistic road-car external aerodynamics by delivering a public CC-BY-SA dataset covering 500 morphologically varied DrivAer notchback geometries, each simulated with a scale-resolving CFD workflow. The pipeline combines ANSA-based geometry morphing, automated meshing with HeXtreme, and OpenFOAM-based SA-σ-DDES simulations on meshes of ~160 million cells, with time-averaging governed by Meancalc to ensure consistent statistical accuracy across cases. The geometry variations span a 16-parameter design-of-experiments space sampled with a Modified Extensible Lattice Sequence. Outputs include full 3D flow fields, surface data, 2D slices, and time-averaged force/moment coefficients, all in open formats. By linking to AhmedML and WindsorML, the dataset enables cross-fidelity benchmarking and transfer learning for ML surrogates, potentially accelerating robust design optimization in automotive aerodynamics.
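To make the design-of-experiments idea concrete, the following is a minimal sketch of a plain extensible lattice sequence, in which point i is the fractional part of the van der Corput radical inverse of i multiplied by an integer generating vector. It is not the authors' Modified Extensible Lattice Sequence: the generating vector, the parameter bounds, and the function names are all hypothetical placeholders chosen for illustration only.

```python
from fractions import Fraction


def van_der_corput(i: int, base: int = 2) -> Fraction:
    """Radical inverse of i in the given base, as an exact fraction in [0, 1)."""
    result = Fraction(0)
    denom = base
    while i > 0:
        i, digit = divmod(i, base)
        result += Fraction(digit, denom)
        denom *= base
    return result


def lattice_sequence(n_points: int, gen_vector: list[int], base: int = 2) -> list[list[float]]:
    """First n_points of an extensible lattice sequence in the unit cube.

    Point i is frac(phi(i) * z), where phi is the radical inverse and z the
    generating vector. New points can be appended without moving earlier ones.
    """
    points = []
    for i in range(n_points):
        phi = van_der_corput(i, base)
        points.append([float((phi * z) % 1) for z in gen_vector])
    return points


def scale_to_bounds(points: list[list[float]], bounds: list[tuple[float, float]]) -> list[list[float]]:
    """Map unit-cube samples onto per-parameter [low, high) ranges."""
    return [
        [lo + u * (hi - lo) for u, (lo, hi) in zip(point, bounds)]
        for point in points
    ]


N_CASES = 500  # number of geometry variants, as in the dataset

# ASSUMPTION: placeholder generating vector (16 odd integers), not an
# optimised one and not the vector used for DrivAerML.
GEN_VECTOR = [1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]

# ASSUMPTION: every morphing parameter is normalised to [-1, 1); the real
# parameters have their own physical ranges.
BOUNDS = [(-1.0, 1.0)] * 16

unit_points = lattice_sequence(N_CASES, GEN_VECTOR)
samples = scale_to_bounds(unit_points, BOUNDS)
print(len(samples), "samples in", len(samples[0]), "dimensions")
```

A sequence of this kind is extensible in the sense that further samples can be added later without discarding earlier ones, which suits a dataset that may grow case by case. In practice the generating vector would be chosen by a dedicated search so that low-dimensional projections are well distributed.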
Abstract
Machine Learning (ML) has the potential to revolutionise the field of automotive aerodynamics, enabling split-second flow predictions early in the design process. However, the lack of open-source training data for realistic road cars, generated using high-fidelity CFD methods, represents a barrier to the development of such ML methods. To address this, a high-fidelity open-source (CC-BY-SA) public dataset for automotive aerodynamics has been generated, based on 500 parametrically morphed variants of the widely-used DrivAer notchback generic vehicle. Mesh generation and scale-resolving CFD were executed using consistent and validated automatic workflows representative of the industrial state-of-the-art. Geometries and rich aerodynamic data are published in open-source formats. To our knowledge, this is the first large, public-domain dataset for complex automotive configurations generated using high-fidelity CFD.
