
DrivAerNet++: A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks

Mohamed Elrefaie, Florin Morar, Angela Dai, Faez Ahmed

TL;DR

DrivAerNet++ addresses the scarcity of public, high-fidelity, multimodal car-aerodynamics data by delivering 8,000 car designs, each simulated with CFD on 24M-cell meshes, for a total of 39 TB of multimodal data. The authors benchmark two core paradigms for drag prediction: geometry-based deep learning on 3D meshes and AutoML on 26-parameter tabular data, demonstrating both the potential and the generalization challenges of current methods across car categories. Key contributions include a 26-parameter parametric design space, multimodal data (meshes, CFD fields, parametrics, point clouds, and labels), a high-fidelity CFD workflow using OpenFOAM v11, and initial large-scale ML benchmarks that reveal how dataset scale and representation affect predictive performance. These resources aim to accelerate data-driven automotive design, surrogate modeling, and CFD acceleration while enabling reproducibility and cross-domain learning in aerodynamics.

Abstract

We present DrivAerNet++, the largest and most comprehensive multimodal dataset for aerodynamic car design. DrivAerNet++ comprises 8,000 diverse car designs modeled with high-fidelity computational fluid dynamics (CFD) simulations. The dataset includes diverse car configurations such as fastback, notchback, and estateback, with different underbody and wheel designs to represent both internal combustion engines and electric vehicles. Each entry in the dataset features detailed 3D meshes, parametric models, aerodynamic coefficients, and extensive flow and surface field data, along with segmented parts for car classification and point cloud data. This dataset supports a wide array of machine learning applications including data-driven design optimization, generative modeling, surrogate model training, CFD simulation acceleration, and geometric classification. With more than 39 TB of publicly available engineering data, DrivAerNet++ fills a significant gap in available resources, providing high-quality, diverse data to enhance model training, promote generalization, and accelerate automotive design processes. Along with rigorous dataset validation, we also provide ML benchmarking results on the task of aerodynamic drag prediction, showcasing the breadth of applications supported by our dataset. This dataset is set to significantly impact automotive design and broader engineering disciplines by fostering innovation and improving the fidelity of aerodynamic evaluations. Dataset and code available at: https://github.com/Mohamedelrefaie/DrivAerNet.

Paper Structure (41 sections, 6 equations, 20 figures, 10 tables)

Figures (20)

  • Figure 1: Data modalities and shape variations in the DrivAerNet++ dataset.
  • Figure 2: Baseline models from which the parametric models of DrivAerNet++ are derived, demonstrating a range of shape designs and configurations. Variations include estateback, fastback, and notchback car body types alongside different underbody configurations such as smooth and detailed. Wheel options are presented with closed, open, detailed, and smooth styles.
  • Figure 3: Design parameters for the generation of the DrivAerNet++ dataset. Several geometric parameters with significant impact on aerodynamics were selected and varied within a specific range. These parameter ranges were chosen to avoid values that are either difficult to manufacture or not aesthetically pleasing. The car sketch is adapted from Heft et al. (2012).
  • Figure 4: The scatter plots in the top row illustrate the relationship between $C_d$ and $C_l$ for different configurations: the first plot shows the influence of underbody configurations, comparing detailed versus smooth underbodies typically used in electric cars. The second plot highlights the impact of design aesthetics and style across car categories (notchback, fastback, and estateback). The third plot examines the effect of different wheel configurations, emphasizing the significance of small geometric modifications on aerodynamics. The density plots in the bottom row show the distribution of $C_d$ for the same configurations, providing a detailed view of how these design elements and categories influence aerodynamic efficiency.
  • Figure 5: Drag coefficient prediction based on the parametric data for different car categories. The plots show the median and 95% confidence interval of the $R^2$ score as a function of the percentage of the training data.
  • ...and 15 more figures
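
The evaluation behind Figure 5 (median and 95% interval of the $R^2$ score as a function of training-set size) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic (26 random parameters and an invented linear drag response), a plain least-squares fit stands in for the AutoML models used in the paper, and the interval is taken as the 2.5th to 97.5th percentile over repeated random splits, which may differ from how the authors computed theirs. The names `r2_score` and `learning_curve` are hypothetical helpers, not part of the DrivAerNet++ code.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def learning_curve(X, y, fractions, n_repeats=20, test_size=0.2, seed=0):
    """Return {fraction: (median, lower, upper)} of test R^2 over random splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_size * n))
    results = {}
    for frac in fractions:
        scores = []
        for _ in range(n_repeats):
            # Fresh random split: hold out a test set, subsample the rest.
            perm = rng.permutation(n)
            test_idx, pool_idx = perm[:n_test], perm[n_test:]
            n_train = max(2, int(round(frac * len(pool_idx))))
            train_idx = pool_idx[:n_train]
            # Least-squares fit with an intercept column (stand-in for AutoML).
            A_train = np.column_stack([X[train_idx], np.ones(n_train)])
            coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
            A_test = np.column_stack([X[test_idx], np.ones(n_test)])
            scores.append(r2_score(y[test_idx], A_test @ coef))
        lower, median, upper = np.percentile(scores, [2.5, 50.0, 97.5])
        results[frac] = (median, lower, upper)
    return results


# Synthetic stand-in for the tabular data: 500 designs, 26 parameters.
data_rng = np.random.default_rng(42)
X = data_rng.uniform(-1.0, 1.0, size=(500, 26))
weights = data_rng.normal(size=26)
y = 0.3 + 0.01 * (X @ weights) + data_rng.normal(scale=0.002, size=500)

curve = learning_curve(X, y, fractions=[0.2, 0.5, 1.0])
for frac, (median, lower, upper) in curve.items():
    print(f"train fraction {frac:.0%}: median R^2 = {median:.3f} "
          f"[{lower:.3f}, {upper:.3f}]")
```

Swapping the least-squares fit for any regressor with a fit/predict interface leaves the rest of the loop unchanged, so the same scaffold could be run per car category to compare how quickly each one saturates.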