Unified schemes for directive-based GPU offloading

Yohei Miki; Toshihiro Hanawa

TL;DR

The paper addresses the challenge of porting CPU-originated codes to GPUs from multiple vendors by introducing Solomon, a header-only macro library that unifies the OpenACC and OpenMP target interfaces. Solomon offers three notations (an intuitive form, an OpenACC-like style, and an OpenMP-like style) to ease adoption for both novices and experts, and the paper demonstrates the approach on an $N$-body simulation and a three-dimensional diffusion equation. With Solomon, a single codebase can run with OpenACC on NVIDIA GPUs or with OpenMP target on NVIDIA/AMD/Intel GPUs, which also allows the two backends to be compared fairly and transparently. The results show cross-vendor offloading with reasonable performance across architectures, reducing vendor lock-in and learning costs in directive-based GPU programming. Overall, Solomon provides a portable, readable, and maintainable path for directive-based GPU offloading across diverse hardware.

Abstract

GPUs are the dominant accelerator devices owing to their high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes originally developed for multicore CPUs. Although OpenACC and OpenMP target provide similar features, each has pros and cons. OpenACC offers richer functionality and abundant documentation, but in practice it is largely limited to NVIDIA GPUs. OpenMP target supports NVIDIA/AMD/Intel GPUs but offers fewer functions than OpenACC. Here, we have developed a header-only library, Solomon (Simple Off-LOading Macros Orchestrating multiple Notations), to unify the interface for GPU offloading with support for both OpenACC and OpenMP target. Solomon provides three types of notation to reduce users' implementation and learning costs: an intuitive notation for beginners and OpenACC-like and OpenMP-like notations for experienced developers. This manuscript describes Solomon's implementation and usage and demonstrates GPU offloading in an $N$-body simulation and the three-dimensional diffusion equation. The library and sample codes are freely and publicly available as open-source software at https://github.com/ymiki-repo/solomon.
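To make the idea of a unifying macro layer concrete, the following is a minimal sketch of how one macro can expand to either an OpenACC or an OpenMP target directive depending on the compiler flags. It is not Solomon's actual interface: the names `DEMO_PRAGMA`, `DEMO_OFFLOAD_LOOP`, and `demo_axpy` are hypothetical and chosen only for illustration. The sketch assumes a C99 compiler (for the `_Pragma` operator) and relies on the standard predefined macros `_OPENACC` and `_OPENMP` to select the backend.

```c
/* Hypothetical helper: turn a token sequence into a pragma.
 * DEMO_PRAGMA(omp target) becomes _Pragma("omp target"). */
#define DEMO_PRAGMA(...) _Pragma(#__VA_ARGS__)

/* Hypothetical unified directive: offload the following loop,
 * copying `in` to the device and `inout` in both directions. */
#if defined(_OPENACC)
/* OpenACC backend */
#define DEMO_OFFLOAD_LOOP(in, inout) \
    DEMO_PRAGMA(acc parallel loop copyin(in) copy(inout))
#elif defined(_OPENMP)
/* OpenMP target backend */
#define DEMO_OFFLOAD_LOOP(in, inout) \
    DEMO_PRAGMA(omp target teams distribute parallel for map(to: in) map(tofrom: inout))
#else
/* No directive support enabled: the loop runs serially on the CPU. */
#define DEMO_OFFLOAD_LOOP(in, inout)
#endif

/* y <- a * x + y, written once and offloaded through whichever
 * backend the compiler flags select. */
void demo_axpy(int n, float a, const float *x, float *y)
{
    DEMO_OFFLOAD_LOOP(x[0:n], y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Compiled without OpenACC or OpenMP flags, the macro expands to nothing and the function is ordinary serial C. Enabling OpenACC or OpenMP offloading in the compiler switches the same source to the corresponding directive, which is the kind of single-source portability the abstract describes.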
