HPAC-ML: A Programming Model for Embedding ML Surrogates in Scientific Applications
Zane Fink, Konstantinos Parasyris, Praneet Rathi, Giorgis Georgakoudis, Harshitha Menon, Peer-Timo Bremer
TL;DR
Problem: integrating ML surrogates into HPC scientific applications is difficult because application data layouts rarely match the tensor layouts that models expect, and because doing so demands substantial ML expertise. Approach: HPAC-ML introduces a directive-based programming model built on two abstractions, a data bridge and execution control, which together orchestrate data movement and surrogate inference; the implementation uses an LLVM-based backend and Torch for inference. Key contributions include a unified data-translation mechanism (memory concretization via tensor functors and tensor maps), an offline data-collection workflow, and Bayesian optimization to automatically explore surrogate architectures. Findings: across five GPU-based HPC benchmarks and over 5000 surrogate models, the system achieves end-to-end speedups of up to $83.6\times$ with RMSE as low as $0.01$, and the layout transformations add negligible overhead. The work is open-source and demonstrates practical scalability for ML-assisted HPC.
Abstract
Recent advancements in Machine Learning (ML) have substantially improved its predictive and computational abilities, offering promising opportunities for surrogate modeling in scientific applications. By accurately approximating complex functions with low computational cost, ML-based surrogates can accelerate scientific applications by replacing computationally intensive components with faster model inference. However, integrating ML models into these applications remains a significant challenge, hindering the widespread adoption of ML surrogates as an approximation technique in modern scientific computing. We propose an easy-to-use directive-based programming model that enables developers to seamlessly describe the use of ML models in scientific applications. The runtime support, as instructed by the programming model, performs data assimilation using the original algorithm and can replace the algorithm with model inference. Our evaluation across five benchmarks, testing over 5000 ML models, shows up to 83.6x speed improvements with minimal accuracy loss (as low as 0.01 RMSE).
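The two runtime behaviours described above, running the original algorithm while recording its inputs and outputs, or replacing it with model inference, can be illustrated with a minimal Python sketch. This is not the HPAC-ML directive syntax or its API: `SurrogateRegion`, `expensive_kernel`, `cheap_surrogate`, and `rmse` are hypothetical names introduced only for illustration, and a truncated Taylor series stands in for a trained model.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]


class SurrogateRegion:
    """Hypothetical wrapper around a code region (not the HPAC-ML API).

    Either runs the original kernel and records (input, output) pairs for
    later training, or calls a surrogate model in its place.
    """

    def __init__(self,
                 original: Callable[[Vector], Vector],
                 surrogate: Optional[Callable[[Vector], Vector]] = None):
        self.original = original
        self.surrogate = surrogate
        self.samples: List[Tuple[Vector, Vector]] = []

    def __call__(self, inputs: Sequence[float], use_model: bool) -> Vector:
        x = list(inputs)  # stand-in for gathering application memory into a tensor
        if use_model and self.surrogate is not None:
            return self.surrogate(x)          # inference replaces the algorithm
        y = self.original(x)                  # accurate path
        self.samples.append((x, list(y)))     # collect training data
        return y


def rmse(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Root-mean-square error between two equally sized sequences."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def expensive_kernel(x: Vector) -> Vector:
    # Stand-in for a costly scientific computation.
    return [math.sin(v) for v in x]


def cheap_surrogate(x: Vector) -> Vector:
    # Stand-in for a trained model: third-order Taylor expansion of sin.
    return [v - v ** 3 / 6.0 for v in x]


region = SurrogateRegion(expensive_kernel, cheap_surrogate)
inputs = [0.0, 0.1, 0.2, 0.3]
reference = region(inputs, use_model=False)  # runs original, records a sample
approx = region(inputs, use_model=True)      # runs the surrogate instead
error = rmse(approx, reference)
```

In this sketch the choice between the two paths is a single boolean, so the same call site serves both data collection and accelerated execution, and `error` quantifies the accuracy given up in exchange for the cheaper path.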
