Data-Driven Surrogate Modeling Techniques to Predict the Effective Contact Area of Rough Surface Contact Problems
Tarik Sahin, Jacopo Bonari, Sebastian Brandstaeter, Alexander Popp
TL;DR
This work addresses the computational bottleneck of the Boundary Element Method (BEM) in predicting the effective contact area $A_e$ for rough surface contact problems by building a data-driven surrogate trained on a large BEM-generated database. It compares multiple regression models, each tuned by grid-search hyperparameter optimization, and finds that Kernel Ridge Regression offers the best balance between predictive accuracy and evaluation speed, while the Gaussian Process Regressor provides uncertainty quantification at higher cost. The surrogate's generalization to unseen configurations is demonstrated, and a detailed cost analysis shows that the break-even point occurs after roughly $1.6\times10^4$ simulations, with database generation as the dominant offline expense. The framework enables efficient multi-query tasks such as uncertainty quantification and parameter identification, expanding the applicability of data-driven surrogates in rough surface contact simulations.
Abstract
The effective contact area in rough surface contact plays a critical role in multi-physics phenomena such as wear, sealing, and thermal or electrical conduction. Although accurate numerical methods, such as the Boundary Element Method (BEM), are available to compute this quantity, their high computational cost limits their applicability in multi-query contexts, such as uncertainty quantification, parameter identification, and multi-scale algorithms, where many repeated evaluations are required. This study proposes a surrogate modeling framework for predicting the effective contact area using fast-to-evaluate data-driven techniques. Various machine learning algorithms are trained on a precomputed dataset, where the inputs are the imposed load and statistical roughness parameters, and the output is the corresponding effective contact area. All models undergo hyperparameter optimization to enable fair comparisons in terms of predictive accuracy and computational efficiency, evaluated using established quantitative metrics. Among the models, the Kernel Ridge Regressor demonstrates the best trade-off between accuracy and efficiency, achieving high predictive accuracy, low prediction time, and minimal training overhead, which makes it a strong candidate for general-purpose surrogate modeling. The Gaussian Process Regressor provides an attractive alternative when uncertainty quantification is required, although it incurs additional computational cost due to variance estimation. The generalization capability of the Kernel Ridge model is validated on an unseen simulation scenario, confirming its ability to transfer to new configurations. Database generation constitutes the dominant cost in the surrogate modeling process. Nevertheless, the approach proves practical and efficient for multi-query tasks, even when accounting for this initial expense.
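As a rough illustration of the workflow described above, the following minimal sketch fits a kernel ridge regressor and selects its hyperparameters by grid search on a held-out validation set. It is not the authors' implementation. The data are a synthetic stand-in for a BEM-generated database, and the three input columns (a normalized load and two roughness descriptors), the toy target function, the RBF kernel choice, and the hyperparameter grids are all assumptions made for demonstration only.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def fit_krr(X, y, gamma, lam):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def predict_krr(X_train, alpha, X_new, gamma):
    """Evaluate the fitted kernel ridge model at new input points."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha


def grid_search(X_tr, y_tr, X_val, y_val, gammas, lams):
    """Return (validation MSE, gamma, lam) for the best grid point."""
    best = None
    for gamma in gammas:
        for lam in lams:
            alpha = fit_krr(X_tr, y_tr, gamma, lam)
            pred = predict_krr(X_tr, alpha, X_val, gamma)
            mse = float(np.mean((pred - y_val) ** 2))
            if best is None or mse < best[0]:
                best = (mse, gamma, lam)
    return best


# Synthetic stand-in data (assumption): columns are a hypothetical normalized
# load and two hypothetical roughness descriptors, all scaled to [0, 1].
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 3))
# Toy target (assumption): a smooth, saturating "contact area fraction".
y = 1.0 - np.exp(-3.0 * X[:, 0] / (0.5 + X[:, 1])) + 0.1 * X[:, 2]

# Split into training, validation, and test sets.
X_tr, y_tr = X[:200], y[:200]
X_val, y_val = X[200:250], y[200:250]
X_te, y_te = X[250:], y[250:]

gammas = [0.1, 1.0, 10.0]   # hypothetical kernel-width grid
lams = [1e-6, 1e-3, 1e-1]   # hypothetical regularization grid

val_mse, gamma, lam = grid_search(X_tr, y_tr, X_val, y_val, gammas, lams)
alpha = fit_krr(X_tr, y_tr, gamma, lam)
test_pred = predict_krr(X_tr, alpha, X_te, gamma)

test_mse = float(np.mean((test_pred - y_te) ** 2))
# Reference error of a constant predictor (the training mean).
baseline_mse = float(np.mean((y_tr.mean() - y_te) ** 2))
print(f"gamma={gamma}, lam={lam}, test MSE={test_mse:.2e}, "
      f"baseline MSE={baseline_mse:.2e}")
```

Training reduces to a single linear solve, and prediction to one kernel evaluation followed by a matrix-vector product. In practice, a library implementation with cross-validation (for example, scikit-learn's `KernelRidge` combined with `GridSearchCV`) would likely replace the hand-written hold-out search shown here.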
