A systematic dataset generation technique applied to data-driven automotive aerodynamics
Mark Benjamin, Gianluca Iaccarino
TL;DR
The paper tackles the high cost and limited diversity of CFD data for automotive drag prediction by introducing an SDF-based, barycentric interpolation workflow that generates large, realistic datasets from a few starting geometries (DrivAer configurations). It combines voxelized geometry representations, robust surface reconstruction, and CNN-based surrogates to predict scalar, vector, and tensor aerodynamic quantities, validated against wall-modeled large-eddy simulation (WMLES) ground truth. Key findings include high predictive accuracy (e.g., $C_d$ errors within tens of drag counts), useful extrapolation capabilities, and clearer guidance on data sampling strategies. The approach is a step toward universal drag predictors and provides a foundation for uncertainty quantification and for future enhancements with alternative geometry representations and out-of-distribution detection.
Abstract
A novel strategy for generating datasets is developed within the context of drag prediction for automotive geometries using neural networks. A primary challenge in this space is constructing a training database of sufficient size and diversity. Our method relies on a small number of starting data points and provides a recipe to interpolate systematically between them, generating an arbitrary number of samples at the desired quality. We test this strategy using a realistic automotive geometry, and demonstrate that convolutional neural networks perform exceedingly well at predicting drag coefficients and surface pressures. Promising results are obtained in testing extrapolation performance. Our method can be applied to other problems of aerodynamic shape optimization.
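
The idea of interpolating between a few starting geometries can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each geometry is stored as a signed distance field (SDF) sampled on a shared voxel grid and that a new sample is a convex (barycentric) combination of those fields. The spheres stand in for the starting car geometries, and the helper names (`make_grid`, `sphere_sdf`, `sample_barycentric`, `blend_sdfs`) and the Dirichlet sampling of weights are hypothetical choices made for this example.

```python
import numpy as np


def make_grid(n=32, lo=-1.0, hi=1.0):
    """Return an (n, n, n, 3) array of voxel-centre coordinates."""
    axis = np.linspace(lo, hi, n)
    return np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)


def sphere_sdf(grid, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return np.linalg.norm(grid - np.asarray(center), axis=-1) - radius


def sample_barycentric(rng, k):
    """Draw k non-negative weights that sum to 1 (uniform on the simplex)."""
    return rng.dirichlet(np.ones(k))


def blend_sdfs(sdfs, weights):
    """Weighted sum of SDFs sampled on the same grid.

    The zero level set of the result is taken as the interpolated surface.
    Note: the blend is generally not an exact distance field.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return np.tensordot(weights, np.stack(sdfs), axes=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = make_grid(32)
    # Three placeholder "starting geometries" (stand-ins for car shapes).
    starts = [
        sphere_sdf(grid, (0.0, 0.0, 0.0), 0.4),
        sphere_sdf(grid, (0.2, 0.0, 0.0), 0.5),
        sphere_sdf(grid, (0.0, 0.2, 0.0), 0.6),
    ]
    w = sample_barycentric(rng, len(starts))
    new_shape = blend_sdfs(starts, w)
    occupancy = new_shape < 0.0  # voxelized interior of the new sample
    print("weights:", w, "occupied voxels:", int(occupancy.sum()))
```

Drawing many weight vectors yields an arbitrary number of intermediate shapes from the same few inputs. In practice a surface would still have to be extracted from the zero level set and checked for quality before simulation, which this sketch does not attempt.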
