Expression Acceleration: Seamless Parallelization of Typed High-Level Languages
Lars Hummelgren, John Wikman, Oscar Eriksson, Philipp Haller, David Broman
TL;DR
The paper presents expression acceleration, a compiler-based approach that lets programmers mark expressions in a high-level, statically typed language for GPU execution. The compiler extracts each marked expression together with its dependencies via lambda lifting, classifies each accelerated binding to one of two backends (Futhark or CUDA), and automatically generates marshaling code to move data between CPU and GPU. It enforces well-formedness constraints through static rules for each backend and dynamic checks in debug mode. An evaluation on Futhark and CUDA benchmarks reports competitive performance and substantial speedups, supporting the practicality of integrating GPU acceleration into existing high-level programs.
Abstract
Efficient parallelization of algorithms on general-purpose GPUs is essential in many areas today. However, it is a non-trivial task for software engineers to utilize GPUs to improve the performance of high-level programs in general. Although many domain-specific approaches are available for GPU acceleration, it is difficult to accelerate existing high-level programs without rewriting parts of the programs using low-level GPU code. We present a compiler implementation using an alternative approach called expression acceleration. This approach marks expressions for acceleration, and the compiler automatically infers which dependent code needs to be accelerated. We design and implement a compiler supporting expression acceleration for a statically typed functional language and evaluate its applicability and performance.
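The dependency inference mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the binding names, the `deps` map, and the `accelerated_closure` helper are all hypothetical, and the sketch assumes the compiler already knows which top-level bindings each binding refers to. Under that assumption, the code that must be accelerated is the set of bindings reachable from the marked expression.

```python
from typing import Dict, Set


def accelerated_closure(deps: Dict[str, Set[str]], marked: Set[str]) -> Set[str]:
    """Return every binding reachable from the marked bindings.

    `deps` maps a binding name to the names it references directly.
    This is a plain graph reachability walk, used here only to
    illustrate the idea of inferring dependent code.
    """
    needed: Set[str] = set()
    stack = list(marked)
    while stack:
        name = stack.pop()
        if name in needed:
            continue  # already visited
        needed.add(name)
        stack.extend(deps.get(name, set()))
    return needed


# Hypothetical program: `main` calls `report` (CPU only) and
# `sum_squares`, which the programmer marked for acceleration.
deps = {
    "main": {"sum_squares", "report"},
    "sum_squares": {"square", "reduce_add"},
    "square": set(),
    "reduce_add": set(),
    "report": {"format_float"},
    "format_float": set(),
}

to_accelerate = accelerated_closure(deps, {"sum_squares"})
print(sorted(to_accelerate))  # ['reduce_add', 'square', 'sum_squares']
```

In this toy example only `sum_squares` and the two helpers it uses are selected, while `main`, `report`, and `format_float` stay on the CPU side. A real compiler would additionally have to handle free variables and nested definitions, which is where lambda lifting comes in.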
