Data efficient surrogate modeling for engineering design: Ensemble-free batch mode deep active learning for regression
Sarthak Kapoor, Harsh Vardhan, Umesh Timalsina, Sumit Kumar, Peter Volgyesi, Janos Sztipanovits
TL;DR
The paper addresses the data-efficiency challenge of constructing surrogates for expensive high-fidelity simulations in engineering design. It introduces a scalable student-teacher deep active learning framework with epsilon-HQS batch sampling, which guides labeling and trains surrogates under a fixed budget. Across computational fluid dynamics (CFD), finite element analysis (FEA), and propeller design domains, epsilon-HQS consistently achieves higher final accuracy than competing acquisition strategies and substantially reduces labeling time. This approach enables accurate, scalable surrogates in high-dimensional design spaces, accelerating design optimization for complex engineering systems.
Abstract
High-fidelity design evaluation processes such as Computational Fluid Dynamics (CFD) and Finite Element Analysis (FEA) are often replaced with data-driven surrogates to reduce computational cost in engineering design optimization. However, building accurate surrogate models still requires a large number of expensive simulations. To address this challenge, we introduce epsilon-HQS, a scalable active learning (AL) strategy that leverages a student-teacher framework to train deep neural networks (DNNs) efficiently. Unlike Bayesian AL methods, which are computationally demanding with DNNs, epsilon-HQS remains scalable while selectively querying informative samples to reduce labeling cost. Applied to CFD, FEA, and propeller design tasks, our method achieves higher accuracy under fixed labeling cost budgets.
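The abstract does not spell out the sampling rule, so the following is only an illustrative sketch of how a batch query step in a student-teacher setup might look, not the authors' implementation. It assumes that a teacher model has already assigned each unlabeled candidate a score estimating how badly the student surrogate would predict it, and that epsilon sets the fraction of each batch drawn uniformly at random rather than by score. The function name `select_batch` and all of its parameters are hypothetical.

```python
import numpy as np


def select_batch(teacher_scores, batch_size, epsilon, rng):
    """Return pool indices to label next (hypothetical helper).

    Assumption: teacher_scores[i] is the teacher's estimate of the student's
    error on candidate i. A fraction `epsilon` of the batch is picked at
    random; the rest are the highest-scoring candidates.
    """
    scores = np.asarray(teacher_scores, dtype=float)
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if not 0 < batch_size <= scores.shape[0]:
        raise ValueError("batch_size must be between 1 and the pool size")

    n_random = int(round(epsilon * batch_size))
    n_greedy = batch_size - n_random

    # Teacher-guided part: candidates with the largest predicted student error.
    order = np.argsort(-scores, kind="stable")
    greedy = order[:n_greedy]
    if n_random == 0:
        return greedy

    # Exploration part: uniform draw from the candidates not already chosen.
    remaining = order[n_greedy:]
    explore = rng.choice(remaining, size=n_random, replace=False)
    return np.concatenate([greedy, explore])


# Usage: six candidates, a batch of four, half of it chosen at random.
rng = np.random.default_rng(0)
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]
batch = select_batch(scores, batch_size=4, epsilon=0.5, rng=rng)
```

In this sketch the two highest-scoring candidates (indices 1 and 3) are always selected, and the other two slots are filled at random from the rest of the pool. Setting `epsilon=0` gives purely teacher-guided selection, and `epsilon=1` gives purely random sampling.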
