Unboxing Virgil ADTs for Fun and Profit

Bradley Wei Jie Teo, Ben L. Titzer

TL;DR

This work explores annotation-guided optimizations to ADT representation in Virgil, a systems-level programming language that compiles to x86, x86-64, WebAssembly, and the Java Virtual Machine. It extends Virgil with two annotations: #unboxed, which eliminates the overhead of heap allocation via an automatic compiler transformation to a scalar representation, and #packed, which enables programmer-expressed bit layouts.

Abstract

Algebraic Data Types (ADTs) are an increasingly common feature in modern programming languages. In many implementations, values of non-nullary, multi-case ADTs are allocated on the heap, which may reduce performance and increase memory usage. This work explores annotation-guided optimizations to ADT representation in Virgil, a systems-level programming language that compiles to x86, x86-64, Wasm, and the Java Virtual Machine. We extend Virgil with two annotations: #unboxed, which eliminates the overhead of heap allocation via an automatic compiler transformation to a scalar representation, and #packed, which enables programmer-expressed bit layouts. These annotations allow programmers both to save memory and to manipulate data in formats dictated by hardware. We dedicate this work as an homage to, and an echo of, work done in collaboration with Jens: "A Declarative Approach to Generating Machine Code Tools", an unpublished manuscript from 2005. In fact, this work inherits some syntactic conventions from that prior work. The performance impact of these representation changes was evaluated on a variety of workloads in terms of execution time and memory usage, but we don't include it because Jens likes semantics and type systems better!
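
To make the idea of a programmer-expressed bit layout concrete, here is a minimal sketch in Python (not Virgil, and not the paper's #packed syntax). It packs the three fields of an IEEE 754 single-precision number into one 32-bit scalar by hand, which is the kind of shifting and masking that a packing declaration would presumably let the compiler generate. The function names pack_f32, unpack_f32, and bits_to_float are hypothetical and chosen only for this illustration.

```python
import struct

# IEEE 754 single precision: 1 sign bit, 8 exponent bits, 23 fraction bits.
SIGN_BITS, EXP_BITS, FRAC_BITS = 1, 8, 23


def pack_f32(sign, exponent, fraction):
    """Pack three fields into one 32-bit integer, most significant field first.

    Hypothetical helper: stands in for what a declared bit layout would do.
    """
    if not 0 <= sign < (1 << SIGN_BITS):
        raise ValueError("sign out of range")
    if not 0 <= exponent < (1 << EXP_BITS):
        raise ValueError("exponent out of range")
    if not 0 <= fraction < (1 << FRAC_BITS):
        raise ValueError("fraction out of range")
    return (sign << (EXP_BITS + FRAC_BITS)) | (exponent << FRAC_BITS) | fraction


def unpack_f32(bits):
    """Recover (sign, exponent, fraction) from a packed 32-bit integer."""
    sign = (bits >> (EXP_BITS + FRAC_BITS)) & ((1 << SIGN_BITS) - 1)
    exponent = (bits >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    fraction = bits & ((1 << FRAC_BITS) - 1)
    return sign, exponent, fraction


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a float, to check the layout."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For instance, pack_f32(0, 127, 0) yields the bit pattern 0x3F800000, which bits_to_float reinterprets as 1.0. The whole value lives in a single machine word, with no heap object involved, which is the property the scalar representation is after.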

Paper Structure

This paper contains 26 sections, 4 equations, 12 figures.

Figures (12)

  • Figure 1: A generic option type written in Virgil, showcasing features of Virgil ADTs, including type parameters, cases with fields, and methods.
  • Figure 2: An example of a Virgil ADT definition and its internal desugaring to classes.
  • Figure 3: A diagram of the Virgil compiler's phases. The largest changes from this work are in purple.
  • Figure 4: A representation of IEEE 754 floating-point numbers using our packing declaration syntax, followed by an example of packing application and concatenation.
  • Figure 5: Syntax for packing expressions and declarations, in Backus-Naur form.
  • ...and 7 more figures