Portable Targeted Sampling Framework Using LLVM

Zhantong Qiu, Mahyar Samani, Jason Lowe-Power

TL;DR

Nugget tackles the challenge of evaluating architectural ideas on realistic workloads by enabling portable, binary-independent sampling through LLVM IR interval analysis and cross-platform nuggets. The framework runs interval analysis on real hardware, generates portable nuggets with start and end markers, and validates samples on native hardware before employing them in simulation, dramatically reducing overhead and enabling cross-system comparisons. Its two main contributions are an efficient, hardware-friendly interval analysis workflow and a flexible nugget creation/validation pipeline that supports diverse sampling methodologies. This approach accelerates research iteration, enables more representative workload reduction, and provides practical cross-architecture validation for full-system simulations.

Abstract

Evaluating architectural ideas on realistic workloads is increasingly challenging due to the prohibitive cost of detailed simulation and the lack of portable sampling tools. Existing targeted sampling techniques are often tied to specific binaries, incur significant overhead, and make rapid validation across systems infeasible. To address these limitations, we introduce Nugget, a flexible framework that enables portable sampling across simulators, hardware, architectural differences, and libraries. Nugget leverages LLVM IR to perform binary-independent interval analysis, then generates lightweight, cross-platform executable snippets (nuggets) that can be validated natively on real hardware before use in simulation. This approach decouples samples from specific binaries, dramatically reduces analysis overhead, and allows researchers to iterate on sampling methodologies while efficiently validating samples across diverse systems.

Paper Structure

This paper contains 31 sections, 11 figures, and 3 tables.

Figures (11)

  • Figure 1: Nugget pipeline. The process is divided into two main stages: (1) preparation and analysis, and (2) nugget selection and validation. The preparation and analysis stage, including base IR file creation and interval analysis execution, only needs to be performed once per program. Nugget creation and validation can be repeated to refine and select the most representative set of samples for the workload.
  • Figure 2: Interval analysis overhead via Nugget and functional simulation.
  • Figure 3: Interval analysis overhead via Nugget across different workloads.
  • Figure 4: Interval analysis overhead via Nugget across different numbers of threads.
  • Figure 5: Prediction error by machine and sampling method. All results shown are obtained from executing the nuggets directly on real hardware, without using simulation.
  • ...and 6 more figures