Wasure: A Modular Toolkit for Comprehensive WebAssembly Benchmarking

Riccardo Carissimi, Ben L. Titzer

TL;DR

Wasure tackles the challenge of benchmarking WebAssembly engines across diverse runtimes, hardware, languages, and configurations by introducing a modular CLI toolkit that streamlines benchmark execution, comparison, and result handling. The approach emphasizes design goals of ease of use, extensibility, portability, and accuracy, supported by a three-phase pipeline (Preparation, Execution, Evaluation) and modular core components for benchmarks, runtimes, execution, results, visualization, and feature checks. Complementing performance evaluation, the paper presents a dynamic analysis of Wasure's benchmark suites using Wizard instrumentation, revealing substantial variation in coverage, hot paths, memory activity, and control-flow patterns across benchmark groups, and highlighting the need for both synthetic and realistic workloads. Collectively, Wasure enables more systematic, transparent, and insightful WebAssembly evaluations, with practical impact for researchers and developers designing and assessing Wasm engines and toolchains.

Abstract

WebAssembly (Wasm) has become a key compilation target for portable and efficient execution across diverse platforms. Benchmarking its performance, however, is a multi-dimensional challenge: it depends not only on the choice of runtime engines, but also on hardware architectures, application domains, source languages, benchmark suites, and runtime configurations. This paper introduces Wasure, a modular and extensible command-line toolkit that simplifies the execution and comparison of WebAssembly benchmarks. To complement performance evaluation, we also conducted a dynamic analysis of the benchmark suites included with Wasure. Our analysis reveals substantial differences in code coverage, control flow, and execution patterns, emphasizing the need for benchmark diversity. Wasure aims to support researchers and developers in conducting more systematic, transparent, and insightful evaluations of WebAssembly engines.

Paper Structure

This paper contains 52 sections, 14 figures, and 4 tables.

Figures (14)

  • Figure 1: Example of a function compiled to WebAssembly.
  • Figure 2: Pipeline of Wasure's core modules.
  • Figure 3: Mean dynamic instruction, block and function coverage across benchmark groups.
  • Figure 4: Normalized instruction hotness percentiles per benchmark. Each bar indicates the mean number of instructions required to account for 50%, 75%, 90%, 95%, and 100% of total execution time per benchmark group.
  • Figure 5: CDF of dynamic execution time covered by functions. The x-axis shows the fraction of total executed functions required to account for increasing percentages of dynamic execution time.
  • ...and 9 more figures
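
The hotness percentiles described in the Figure 4 caption can be illustrated with a minimal sketch. It assumes that hotness is measured by per-instruction dynamic execution counts, as an instrumentation tool might report them; the function name `hotness_percentiles` and the input format are hypothetical and are not taken from Wasure or Wizard.

```python
from typing import Dict, List, Sequence


def hotness_percentiles(
    counts: List[int],
    thresholds: Sequence[float] = (0.50, 0.75, 0.90, 0.95, 1.00),
) -> Dict[float, float]:
    """For each threshold, return the fraction of executed instructions
    needed to cover that share of the total dynamic execution count.

    `counts` holds one execution count per static instruction
    (hypothetical input format; zero means the instruction never ran).
    """
    # Consider only instructions that ran, hottest first.
    executed = sorted((c for c in counts if c > 0), reverse=True)
    total = sum(executed)
    if total == 0:
        return {t: 0.0 for t in thresholds}

    result: Dict[float, float] = {}
    running = 0  # cumulative execution count covered so far
    idx = 0      # number of hottest instructions consumed so far
    for t in sorted(thresholds):
        target = t * total
        # Keep adding the next-hottest instruction until the target is met.
        while running < target and idx < len(executed):
            running += executed[idx]
            idx += 1
        # Normalize by the number of executed instructions.
        result[t] = idx / len(executed)
    return result


if __name__ == "__main__":
    sample = [60, 25, 8, 4, 3, 0]  # made-up counts for illustration
    for threshold, fraction in hotness_percentiles(sample).items():
        print(f"{threshold:.0%} of execution covered by {fraction:.0%} of instructions")
```

A small fraction at the 90% threshold indicates that execution is concentrated in a few hot instructions, whereas a larger one indicates a flatter profile.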