ACC Saturator: Automatic Kernel Optimization for Directive-Based GPU Code
Kazuaki Matsumura, Simon Garcia De Gonzalo, Antonio J. Peña
TL;DR
This paper applies equality saturation to optimize the sequential code used in directive-based GPU programming, and presents a fully automated framework that simultaneously achieves less computation, fewer memory accesses, and high memory throughput.
Abstract
Automatic code optimization is a complex process that typically involves the application of multiple discrete algorithms that modify the program structure irreversibly. However, the design of these algorithms is often monolithic, and they require repetitive implementation to perform similar analyses due to the lack of cooperation. To address this issue, modern optimization techniques, such as equality saturation, allow for exhaustive term rewriting at various levels of inputs, thereby simplifying compiler design. In this paper, we propose equality saturation to optimize sequential codes utilized in directive-based programming for GPUs. Our approach realizes less computation, less memory access, and high memory throughput simultaneously. Our fully-automated framework constructs single-assignment forms from inputs to be entirely rewritten while keeping dependencies and extracts optimal cases. Through practical benchmarks, we demonstrate a significant performance improvement on several compilers. Furthermore, we highlight the advantages of computational reordering and emphasize the significance of memory-access order for modern GPUs.
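To make the kind of rewrite described above concrete, here is a minimal hand-written sketch in C with an OpenACC directive. It is not output from the framework, and the function names `kernel_naive` and `kernel_rewritten` are hypothetical. It assumes the four arrays do not overlap. The first version loads `a[i]` and `b[i]` twice and computes their product twice; the second is the sort of equivalent form an optimizer could select, with loads grouped at the top of the loop body and the shared product computed once.

```c
#include <stddef.h>

/* Before: each of a[i] and b[i] is read twice, and a[i] * b[i] is
   computed twice. Without extra aliasing information a compiler may
   not be able to reuse the loads across the store to c[i]. */
void kernel_naive(size_t n, const double *a, const double *b,
                  double *c, double *d)
{
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n], d[0:n])
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] * b[i] + a[i];
        d[i] = a[i] * b[i] - b[i];
    }
}

/* After (illustrative, assuming non-overlapping arrays): every value is
   assigned once, loads are issued together before any store, and the
   common subexpression is shared. */
void kernel_rewritten(size_t n, const double *a, const double *b,
                      double *c, double *d)
{
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n], d[0:n])
    for (size_t i = 0; i < n; i++) {
        const double ai = a[i];    /* single load of a[i] */
        const double bi = b[i];    /* single load of b[i] */
        const double ab = ai * bi; /* product computed once */
        c[i] = ab + ai;
        d[i] = ab - bi;
    }
}
```

Both functions compile as ordinary sequential C when OpenACC is disabled, since the directive is then ignored. Whether the second form is actually faster depends on the compiler and the GPU.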
