DrivAerNet++: A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks
Mohamed Elrefaie, Florin Morar, Angela Dai, Faez Ahmed
TL;DR
DrivAerNet++ addresses the scarcity of public, high-fidelity, multimodal car-aerodynamics data by providing 8,000 CFD-simulated designs with 24M-cell meshes and 39 TB of multimodal data. The authors benchmark two paradigms for drag prediction: geometry-based deep learning on 3D meshes and AutoML on 26-parameter tabular data. The results show both the potential of current methods and their difficulty generalizing across car categories. Key contributions include a 26-parameter parametric design space, multimodal data (meshes, CFD fields, parametric descriptions, point clouds, and labels), a high-fidelity CFD workflow using OpenFOAM v11, and initial large-scale ML benchmarks that show how dataset scale and representation affect predictive performance. These resources aim to support data-driven automotive design, surrogate modeling, and CFD acceleration while enabling reproducibility and cross-domain learning in aerodynamics.
Abstract
We present DrivAerNet++, the largest and most comprehensive multimodal dataset for aerodynamic car design. DrivAerNet++ comprises 8,000 diverse car designs modeled with high-fidelity computational fluid dynamics (CFD) simulations. The dataset includes diverse car configurations such as fastback, notchback, and estateback, with different underbody and wheel designs to represent both internal combustion engines and electric vehicles. Each entry in the dataset features detailed 3D meshes, parametric models, aerodynamic coefficients, and extensive flow and surface field data, along with segmented parts for car classification and point cloud data. This dataset supports a wide array of machine learning applications including data-driven design optimization, generative modeling, surrogate model training, CFD simulation acceleration, and geometric classification. With more than 39 TB of publicly available engineering data, DrivAerNet++ fills a significant gap in available resources, providing high-quality, diverse data to enhance model training, promote generalization, and accelerate automotive design processes. Along with rigorous dataset validation, we also provide ML benchmarking results on the task of aerodynamic drag prediction, showcasing the breadth of applications supported by our dataset. This dataset is set to significantly impact automotive design and broader engineering disciplines by fostering innovation and improving the fidelity of aerodynamic evaluations. Dataset and code available at: https://github.com/Mohamedelrefaie/DrivAerNet.
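For readers new to the benchmark target, the drag coefficient is conventionally defined as the drag force normalized by dynamic pressure times a reference (frontal) area. The sketch below illustrates that textbook definition only; the function name, argument names, and numeric values are hypothetical and are not taken from the dataset or its code.

```python
def drag_coefficient(drag_force_n, air_density_kg_m3, speed_m_s, frontal_area_m2):
    """Textbook definition: C_d = F_d / (0.5 * rho * v^2 * A).

    All argument names are illustrative assumptions, not dataset fields.
    """
    # Dynamic pressure of the free stream, in pascals.
    dynamic_pressure = 0.5 * air_density_kg_m3 * speed_m_s ** 2
    # Normalize the drag force by dynamic pressure times reference area.
    return drag_force_n / (dynamic_pressure * frontal_area_m2)


# Hypothetical values chosen only to exercise the formula.
cd = drag_coefficient(
    drag_force_n=300.0,
    air_density_kg_m3=1.2,
    speed_m_s=30.0,
    frontal_area_m2=2.0,
)
print(f"C_d = {cd:.3f}")
```

A surrogate model for drag prediction learns to map a car's geometry (or its design parameters) to this scalar directly, without running a new CFD simulation.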
