Inside VOLT: Designing an Open-Source GPU Compiler

Shinnung Jeong, Chihyo Ahn, Huanzhi Pu, Jisheng Zhao, Hyesoon Kim, Blaise Tine

TL;DR

Open-source GPU compiler support for SIMT is fragmented and hard to extend. VOLT is a three-layer, middle-end-centric framework that centralizes SIMT analyses and divergence management, so that optimizations can be reused across front-ends and open GPU variants. The paper demonstrates extensibility through two case studies (ISA extensions and host-runtime APIs), and reports favorable compile-time overhead, correctness across 32 OpenCL and 17 CUDA benchmarks, and practical pathways for adopting VOLT in open-GPU ecosystems. By providing portable, extensible tooling and a transparent compilation pipeline, the framework has the potential to accelerate research and experimentation in open GPU architectures.

Abstract

Recent efforts in open-source GPU research are opening new avenues in a domain that has long been tightly coupled with a few commercial vendors. Emerging open GPU architectures define SIMT functionality through their own ISAs, but executing existing GPU programs and optimizing performance on these ISAs relies on a compiler framework that is technically complex and often undercounted in open hardware development costs. To address this challenge, the Vortex-Optimized Lightweight Toolchain (VOLT) has been proposed. This paper presents its design principles, overall structure, and the key compiler transformations required to support SIMT execution on Vortex. VOLT enables SIMT code generation and optimization across multiple levels of abstraction through a hierarchical design that accommodates diverse front-end languages and open GPU hardware. To ensure extensibility as GPU architectures evolve, VOLT centralizes fundamental SIMT-related analyses and optimizations in the middle-end, allowing them to be reused across front-ends and easily adapted to emerging open-GPU variants. Through two case studies on ISA extensions and host-runtime API, this paper also demonstrates how VOLT can support extensions

Paper Structure

This paper contains 26 sections, 10 figures, 2 tables, 2 algorithms.

Figures (10)

  • Figure 1: Illustration of thread execution with control-flow divergence (assuming a total of four threads).
  • Figure 2: Example of machine-level code for control-flow constructs: if-else and loop
  • Figure 3: Vortex architecture.
  • Figure 4: Overview of the VOLT framework. Blue components are proposed or extended to support the Vortex GPU. Green boxes represent the host-side compilation flow, and others represent the kernel compilation flow.
  • Figure 5: Challenges of Split/join IR insertion
  • ...and 5 more figures