TPDE: A Fast Adaptable Compiler Back-End Framework
Tobias Schwarz, Tobias Kamm, Alexis Engelke
TL;DR
TPDE is a fast, adaptable compiler back-end framework for SSA-form IRs that eliminates the need for an IR translation step. It couples an IR adapter with architecture-aware instruction compilers and, optionally, architecture-specific snippet encoders derived from LLVM Machine IR. A separate liveness analysis pass guides register allocation and spilling, after which code is generated in a single pass. The framework is demonstrated with back-ends for LLVM-IR (x86-64 and AArch64), Cranelift IR for WebAssembly, and Umbra IR. On SPECint 2017, the LLVM-IR back-end compiles 8-24x faster than LLVM -O0 with comparable run-time performance, and the domain-specific back-ends further reduce end-to-end latency in JIT contexts. These results indicate that TPDE reduces compilation latency across diverse IRs while keeping code quality competitive, enabling faster start-up for dynamic languages, databases, and WebAssembly runtimes. The approach also lowers maintenance and porting costs by reusing high-level snippet encoders and providing IR-agnostic components that can be combined to target new architectures.
Abstract
Fast machine code generation is especially important for fast start-up just-in-time compilation, where the compilation time is part of the end-to-end latency. However, widely used compiler frameworks like LLVM do not prioritize fast compilation and require an extra IR translation step, which increases latency even further. Rolling a custom code generator, in turn, is a substantial engineering effort, especially when targeting multiple architectures. Therefore, in this paper, we present TPDE, a compiler back-end framework that adapts to existing code representations in SSA form. Using an IR-specific adapter that provides canonical access to IR data structures, together with a specification of the IR semantics, the framework performs one analysis pass and then compiles in a single pass that combines instruction selection, register allocation, and instruction encoding. The generated target instructions are primarily derived from code written in a high-level language through LLVM's Machine IR, easing portability to different architectures while enabling optimizations during code generation. To show the generality of our framework, we build a new back-end for LLVM from scratch targeting x86-64 and AArch64. Performance results on SPECint 2017 show that we can compile LLVM-IR 8-24x faster than LLVM -O0 while being on par in terms of run-time performance. We also demonstrate the benefits of adapting to domain-specific IRs in JIT contexts, particularly WebAssembly and database query compilation, where avoiding the extra IR translation further reduces compilation latency.
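To make the structure described in the abstract more concrete, the following is a minimal sketch of the adapter, analysis pass, and single compilation pass for a toy straight-line SSA IR. It is not TPDE's actual API: the names `Inst`, `ToyAdapter`, `analyze_last_uses`, and `compile_single_pass` are hypothetical, the emitted three-operand pseudo-assembly is invented for illustration, and spilling and control flow are omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    dest: str    # SSA value defined by this instruction
    op: str      # "const", "add", or "mul" in this toy IR
    args: tuple  # operand value names, or one int literal for "const"


class ToyAdapter:
    """Hypothetical adapter: uniform access to a toy IR stored in a list."""

    def __init__(self, insts):
        self._insts = list(insts)

    def instructions(self):
        return self._insts

    def operands(self, inst):
        # Constants carry a literal, not value operands.
        return () if inst.op == "const" else inst.args

    def result(self, inst):
        return inst.dest


def analyze_last_uses(adapter):
    """Analysis pass: map each value to the index of its last use."""
    last_use = {}
    for idx, inst in enumerate(adapter.instructions()):
        for value in adapter.operands(inst):
            last_use[value] = idx
    return last_use


def compile_single_pass(adapter, num_regs=3):
    """Single pass: pick an instruction form, assign registers, emit text."""
    last_use = analyze_last_uses(adapter)
    free = [f"r{i}" for i in reversed(range(num_regs))]  # pop() yields r0 first
    loc = {}  # value name -> register currently holding it
    asm = []
    for idx, inst in enumerate(adapter.instructions()):
        ops = adapter.operands(inst)
        srcs = [loc[v] for v in ops]
        # Release registers of operands that die here, so the result may
        # reuse them. dict.fromkeys deduplicates while keeping order.
        for value in dict.fromkeys(ops):
            if last_use[value] == idx:
                free.append(loc.pop(value))
        if not free:
            raise RuntimeError("out of registers; spilling omitted in this sketch")
        dst = free.pop()
        loc[adapter.result(inst)] = dst
        if inst.op == "const":
            asm.append(f"mov {dst}, {inst.args[0]}")
        else:
            asm.append(f"{inst.op} {dst}, {srcs[0]}, {srcs[1]}")
    return asm


program = [
    Inst("a", "const", (2,)),
    Inst("b", "const", (3,)),
    Inst("c", "add", ("a", "b")),
    Inst("d", "mul", ("c", "a")),
]
listing = compile_single_pass(ToyAdapter(program))
```

The point of the sketch is the division of labor: only `ToyAdapter` knows how the IR is stored, while the analysis and compilation passes use nothing but the adapter's accessors. A value that is never used keeps its register until the end in this sketch, which a real liveness analysis would avoid.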
