Table of Contents
Fetching ...

Pydrofoil: accelerating Sail-based instruction set simulators

Carl Friedrich Bolz-Tereick, Luke Panayi, Ferdia McKeogh, Tom Spink, Martin Berger

TL;DR

ISA specifications in Sail enable formal verification but traditionally yield slow ISSs when compiled to C. Pydrofoil combines an ahead-of-time compiler with a PyPy-based tracing JIT and domain-specific optimisations to accelerate Sail-generated ISSs, achieving over $230\times$ speedup on a RISC-V model relative to the Sail C backend. Ablation studies identify the JIT and static bitvector/integer optimisations as the largest contributors, with memory-immutability tracking and inlining providing additional gains. While the approach moves toward the performance of hand-tuned ISSs like QEMU, there remains a substantial gap (about $26.7\times$ on average in reported benchmarks), prompting a concrete set of future improvements and cross-ISA applications (Arm, CHERIoT). Overall, Pydrofoil demonstrates the viability of compiling high-level ISA specifications into performant, JIT-accelerated ISSs and outlines a practical roadmap for achieving parity with industrial emulators.

Abstract

We present Pydrofoil, a multi-stage compiler that generates instruction set simulators (ISSs) from processor instruction set architectures (ISAs) expressed in the high-level, verification-oriented ISA specification language Sail. Pydrofoil shows a > 230x speedup over the C-based ISS generated by Sail on our benchmarks, and is based on the following insights. (i) An ISS is effectively an interpreter loop, and tracing just-in-time (JIT) compilers have proven effective at accelerating those, albeit mostly for dynamically typed languages. (ii) ISS workloads are highly atypical, dominated by intensive bit manipulation operations. Conventional compiler optimisations for general-purpose programming languages have limited impact for speeding up such workloads. We develop suitable domain-specific optimisations. (iii) Neither tracing JIT compilers, nor ahead-of-time (AOT) compilation alone, even with domain-specific optimisations, suffice for the generation of performant ISSs. Pydrofoil therefore implements a hybrid approach, pairing an AOT compiler with a tracing JIT built on the meta-tracing PyPy framework. AOT and JIT use domain-specific optimisations. Our benchmarks demonstrate that combining AOT and JIT compilers provides significantly greater performance gains than using either compiler alone.

Pydrofoil: accelerating Sail-based instruction set simulators

TL;DR

ISA specifications in Sail enable formal verification but traditionally yield slow ISSs when compiled to C. Pydrofoil combines an ahead-of-time compiler with a PyPy-based tracing JIT and domain-specific optimisations to accelerate Sail-generated ISSs, achieving over speedup on a RISC-V model relative to the Sail C backend. Ablation studies identify the JIT and static bitvector/integer optimisations as the largest contributors, with memory-immutability tracking and inlining providing additional gains. While the approach moves toward the performance of hand-tuned ISSs like QEMU, there remains a substantial gap (about on average in reported benchmarks), prompting a concrete set of future improvements and cross-ISA applications (Arm, CHERIoT). Overall, Pydrofoil demonstrates the viability of compiling high-level ISA specifications into performant, JIT-accelerated ISSs and outlines a practical roadmap for achieving parity with industrial emulators.

Abstract

We present Pydrofoil, a multi-stage compiler that generates instruction set simulators (ISSs) from processor instruction set architectures (ISAs) expressed in the high-level, verification-oriented ISA specification language Sail. Pydrofoil shows a > 230x speedup over the C-based ISS generated by Sail on our benchmarks, and is based on the following insights. (i) An ISS is effectively an interpreter loop, and tracing just-in-time (JIT) compilers have proven effective at accelerating those, albeit mostly for dynamically typed languages. (ii) ISS workloads are highly atypical, dominated by intensive bit manipulation operations. Conventional compiler optimisations for general-purpose programming languages have limited impact for speeding up such workloads. We develop suitable domain-specific optimisations. (iii) Neither tracing JIT compilers, nor ahead-of-time (AOT) compilation alone, even with domain-specific optimisations, suffice for the generation of performant ISSs. Pydrofoil therefore implements a hybrid approach, pairing an AOT compiler with a tracing JIT built on the meta-tracing PyPy framework. AOT and JIT use domain-specific optimisations. Our benchmarks demonstrate that combining AOT and JIT compilers provides significantly greater performance gains than using either compiler alone.

Paper Structure

This paper contains 33 sections, 14 figures.

Figures (14)

  • Figure 1: Sail's multi-stage compilation pipeline from Sail to an ISS running as a static binary.
  • Figure 2: Two of the RISC-V instruction formats WatermanA:riscvISAvolI.
  • Figure 3: Fragment of the $\mathtt{execute}$ clause of $\mathtt{ITYPE}$ instructions after compilation to C by Sail.
  • Figure 4: Pydrofoil multi-stage compilation pipeline from Sail to a JITed ISS.
  • Figure 5: JIB code for the $\mathtt{execute}$ clause of $\mathtt{ITYPE}$ instructions.
  • ...and 9 more figures