Data efficient surrogate modeling for engineering design: Ensemble-free batch mode deep active learning for regression

Sarthak Kapoor; Harsh Vardhan; Umesh Timalsina; Sumit Kumar; Peter Volgyesi; Janos Sztipanovits

TL;DR

The paper addresses the data-efficiency challenge of building surrogates for expensive high-fidelity simulations in engineering design. It introduces a scalable student–teacher deep active learning framework that uses epsilon-HQS batch sampling to decide which designs to label, so that surrogates can be trained under a fixed labeling budget. Across computational fluid dynamics (CFD), finite element analysis (FEA), and propeller design domains, epsilon-HQS consistently achieves higher final accuracy than competing acquisition strategies and substantially reduces labeling time. The approach enables accurate, scalable surrogates in high-dimensional design spaces, which accelerates design optimization for complex engineering systems.

Abstract

High-fidelity design evaluation processes such as Computational Fluid Dynamics (CFD) and Finite Element Analysis (FEA) are often replaced with data-driven surrogates to reduce computational cost in engineering design optimization. However, building accurate surrogate models still requires a large number of expensive simulations. To address this challenge, we introduce epsilon-HQS, a scalable active learning (AL) strategy that leverages a student–teacher framework to train deep neural networks (DNNs) efficiently. Unlike Bayesian AL methods, which are computationally demanding with DNNs, epsilon-HQS selectively queries informative samples to reduce labeling cost. Applied to CFD, FEA, and propeller design tasks, our method achieves higher accuracy under fixed labeling cost budgets.

Paper Structure (7 sections, 12 equations, 2 figures, 1 table, 1 algorithm)

Problem Statement
Approach
Batching
Epsilon($\epsilon$)-weighted Hybrid Query Strategy
Experiments and Results
Related work
Conclusions

Figures (2)

Figure 1: Student Teacher Architecture: The Process
Figure 2: Comparison of the expected (mean) test accuracy of the trained surrogate in each domain for all approaches at different training iterations. DBAL50 (Diverse Batch Active Learning with $\beta$=50), DBAL10 (Diverse Batch Active Learning with $\beta$=10), random (batch sampled uniformly at random), ep_025 ($\epsilon$-HQS with constant $\epsilon$=0.25), ep_05 ($\epsilon$-HQS with constant $\epsilon$=0.5), ep_075 ($\epsilon$-HQS with constant $\epsilon$=0.75), ep_1 ($\epsilon$-HQS with constant $\epsilon$=1.0), ep_greedy ($\epsilon$-HQS with logarithmically increasing $\epsilon$).
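
The labels in Figure 2 distinguish a constant $\epsilon$ from a logarithmically increasing one. The summary above does not give the exact schedule or batch rule, so the following is only a minimal sketch under stated assumptions: `epsilon_schedule`, `select_batch`, and `pool_scores` are hypothetical names, the logarithmic form is one plausible choice, and $\epsilon$ is assumed to be the fraction of each batch chosen by a student-derived score, with the remainder drawn uniformly at random. The paper's own definition may differ, including in which direction $\epsilon$ weights the two parts.

```python
import math
import random


def epsilon_schedule(iteration, total_iterations, mode="log", constant=0.5):
    """Return epsilon in [0, 1] for an active-learning iteration.

    mode="constant" mirrors labels such as ep_025 or ep_05.
    mode="log" is an ASSUMED logarithmically increasing form;
    it requires total_iterations >= 1.
    """
    if mode == "constant":
        return constant
    # Assumed form: 0 at iteration 0, 1 at the final iteration.
    return math.log(1 + iteration) / math.log(1 + total_iterations)


def select_batch(pool_scores, batch_size, epsilon, rng):
    """Pick batch_size pool indices (batch_size <= len(pool_scores)).

    pool_scores[i] is a hypothetical student-network estimate of how
    poorly the teacher (surrogate) predicts candidate i; higher means
    the candidate is assumed to be more informative.
    """
    n_guided = round(epsilon * batch_size)
    # Rank candidates from highest to lowest score.
    ranked = sorted(range(len(pool_scores)),
                    key=lambda i: pool_scores[i], reverse=True)
    guided = ranked[:n_guided]
    # Fill the rest of the batch uniformly at random from what is left.
    explored = rng.sample(ranked[n_guided:], batch_size - n_guided)
    return guided + explored


rng = random.Random(0)
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.4, 0.6]
eps = epsilon_schedule(iteration=3, total_iterations=10)
batch = select_batch(scores, batch_size=4, epsilon=eps, rng=rng)
```

Under these assumptions, $\epsilon$=1.0 reduces to purely score-driven selection and $\epsilon$=0 reduces to the random baseline, which is one way to read the ep_1 and random labels in the figure.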
