Learning Generalizable Program and Architecture Representations for Performance Modeling
Lingda Li, Thomas Flynn, Adolfy Hoisie
TL;DR
PerfVec addresses generalizable performance modeling by learning independent, orthogonal representations for programs and microarchitectures, and by composing a program’s representation from the representations of its executed instructions. The framework introduces a foundation model for instruction representations and uses microarchitecture sampling to train without a full architecture model, so that a cross-architecture prediction reduces to a dot product between a program representation and a microarchitecture representation. Key contributions include (1) a compositional representation scheme $\bm{R}_p = \sum_i \bm{R}_i$, where $\bm{R}_i$ is the representation of the $i$-th executed instruction, with the performance prediction given by $T = \bm{R}_p \cdot \bm{M}$ for a microarchitecture representation $\bm{M}$, (2) a scalable training strategy that combines instruction representation reuse with microarchitecture sampling, and (3) demonstrations of accuracy and generality on unseen programs and microarchitectures, plus applications in design space exploration and loop tiling analysis. The approach reduces training cost and prediction latency, which makes it a candidate for performance modeling workflows in HPC and systems design.
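The composition rule and the dot-product prediction can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the dimension `DIM`, the random vectors, and the helper names `program_representation` and `predict` are assumptions made for illustration, standing in for representations that a trained model would supply.

```python
import numpy as np

DIM = 8  # hypothetical representation size; not taken from the source
rng = np.random.default_rng(0)

# Stand-ins for learned vectors: one row per executed instruction (R_i),
# and a single microarchitecture representation (M).
instr_reprs = rng.normal(size=(5, DIM))
march_repr = rng.normal(size=DIM)


def program_representation(instr_reprs: np.ndarray) -> np.ndarray:
    """Compose R_p as the sum of the instruction representations R_i."""
    return instr_reprs.sum(axis=0)


def predict(program_repr: np.ndarray, march_repr: np.ndarray) -> float:
    """Compute the prediction T as the dot product R_p . M."""
    return float(program_repr @ march_repr)


program_repr = program_representation(instr_reprs)
total = predict(program_repr, march_repr)

# The dot product is linear, so T also equals the sum of the
# per-instruction terms R_i . M.
per_instr = instr_reprs @ march_repr
assert np.isclose(total, per_instr.sum())
```

Because the dot product distributes over the sum, the same $T$ is obtained whether the instruction representations are summed first or each $\bm{R}_i \cdot \bm{M}$ is computed and then accumulated.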
Abstract
Performance modeling is an essential tool in many areas, including performance characterization and optimization, design space exploration, and resource allocation. However, existing performance modeling approaches have limitations, such as the high computational cost of discrete-event simulators, the limited flexibility of hardware emulators, and the restricted accuracy or generality of analytical and data-driven models. To address these limitations, this paper proposes PerfVec, a novel deep learning-based performance modeling framework that learns high-dimensional, independent/orthogonal program and microarchitecture representations. Once learned, a program representation can be used to predict the program's performance on any microarchitecture, and likewise, a microarchitecture representation can be used to predict the performance of any program. Additionally, PerfVec yields a foundation model that captures the performance essence of instructions, which developers can use directly in numerous performance-modeling-related tasks without incurring its training cost. The evaluation demonstrates that PerfVec is more general and efficient than previous approaches.
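Since program and microarchitecture representations are learned independently, predictions for every program on every microarchitecture can be obtained from a single matrix product. The sketch below shows this under stated assumptions: the three-dimensional vectors are made-up values rather than learned representations, `predict_grid` is a hypothetical helper, and the prediction is treated as a cost where lower is better.

```python
import numpy as np


def predict_grid(program_reprs: np.ndarray, march_reprs: np.ndarray) -> np.ndarray:
    """Return a grid whose entry [i, j] is program_reprs[i] . march_reprs[j]."""
    return program_reprs @ march_reprs.T


# Made-up representations: 2 programs and 3 microarchitectures.
programs = np.array([[1.0, 2.0, 0.0],
                     [0.0, 1.0, 3.0]])
marchs = np.array([[1.0, 1.0, 1.0],
                   [2.0, 0.25, 1.0],
                   [0.5, 2.0, 0.25]])

grid = predict_grid(programs, marchs)

# Assuming the prediction is a cost (lower is better), pick the
# microarchitecture with the smallest predicted value for each program.
best_march = grid.argmin(axis=1)
print(grid)
print(best_march)
```

Adding a new microarchitecture to such a grid only requires appending one row to `marchs`; the program representations are reused unchanged.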
