Partial Cross-Compilation and Mixed Execution for Accelerating Dynamic Binary Translation

Yuhao Gu, Zhongchun Zheng, Nong Xiao, Yutong Lu, Xianwei Zhang

TL;DR

The paper tackles the slowdown of cross-ISA execution by introducing TECH-NAME, a hybrid mechanism that offloads selected guest functions to native host code to bypass dynamic binary translation overhead. It integrates compile-time function extraction with runtime bridging to manage ABI differences and interleaved host-guest calls, supported by optimizations (GRT, FCP, PFO) to minimize cross-boundary costs. A practical prototype, IMPL-NAME, built on LLVM and QEMU, demonstrates substantial speedups (up to ~13x on AArch64 and ~19x on x86-64; mean ~3x) and shows applicability to libraries as well as applications. The work demonstrates the practical potential of automated, source-guided offloading for broad real-world software, including legacy and closed-source components, and points to future improvements in adaptive offloading and deeper compiler-emulator co-optimizations.

Abstract

With the growing diversity of instruction set architectures (ISAs), cross-ISA program execution has become common. Dynamic binary translation (DBT) is the main solution but suffers from poor performance. Cross-compilation avoids emulation costs but is constrained by an "all-or-nothing" model: programs are either fully cross-compiled or entirely emulated. Complete cross-compilation is often infeasible due to ISA-specific code or missing dependencies, leaving programs with high emulation overhead. We propose a hybrid execution system that combines compilation and emulation, featuring a selective function offloading mechanism. This mechanism establishes cross-environment calling channels, offloading eligible functions to the host for native execution to reduce DBT overhead. Key optimizations address offloading costs, enabling efficient hybrid operation. Built on LLVM and QEMU, the system works automatically for both applications and libraries. Evaluations show speedups of up to 13x over existing DBT, indicating strong practical value.

Paper Structure

This paper contains 26 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: The TECH-NAME mechanism enables partial offloading of guest code to the host side for native execution, thereby bypassing the emulation overhead of DBT.
  • Figure 2: Though the whole source code can't be cross-built to the host target, the target-agnostic part of the source may still be exploited to accelerate DBT.
  • Figure 3: Overview of the TECH-NAME mechanism.
  • Figure 4: Performance of x86-64 emulation on AArch64. The bars (left y-axis) show speedup relative to the original QEMU; the line (right y-axis) shows the wall time.
  • Figure 5: Number of guest-to-host calls during the execution of all the workloads.
  • ...and 2 more figures