Virtuoso: Enabling Fast and Accurate Virtual Memory Research via an Imitation-based Operating System Simulation Methodology

Konstantinos Kanellopoulos, Konstantinos Sgouras, F. Nisa Bostanci, Andreas Kosmas Kakolyris, Berkin Kerim Konar, Rahul Bera, Mohammad Sadrosadati, Rakesh Kumar, Nandita Vijaykumar, Onur Mutlu

TL;DR

Virtuoso tackles the challenge of evaluating virtual memory (VM) hardware/software co-designs by introducing MimicOS, a lightweight, imitation-based OS kernel that is integrated with architectural simulators through dual channels, which separately capture functional events and OS-induced overheads. The result is a fast yet accurate platform (VirTool) for exploring VM designs across the microarchitectural and OS layers, validated against a real server and demonstrated through five VM use cases. Key findings include higher IPC accuracy than the baseline, accurate page fault (PF) latency modeling, and the ability to study diverse memory management unit (MMU) and page table (PT) designs, exposing the trade-offs between translation latency, memory interference, and swapping behavior. The work lowers the barrier to rapidly prototyping and comparing VM improvements, bridges the gap between fast emulation and full-system simulation, and is openly available for the research community to extend.

Abstract

The unprecedented growth in data demand from emerging applications has turned virtual memory (VM) into a major performance bottleneck. Researchers explore new hardware/OS co-designs to optimize VM across diverse applications and systems. To evaluate such designs, researchers rely on various simulation methodologies to model VM components. Unfortunately, current simulation tools either (i) lack the desired accuracy in modeling VM's software components or (ii) are too slow and complex to prototype and evaluate schemes that span the hardware/software boundary. We introduce Virtuoso, a new simulation framework that enables quick and accurate prototyping and evaluation of the software and hardware components of the VM subsystem. The key idea of Virtuoso is to employ a lightweight userspace OS kernel, called MimicOS, that (i) accelerates simulation time by imitating only the desired kernel functionalities, (ii) facilitates the development of new OS routines that imitate real ones, using an accessible high-level programming interface, and (iii) enables accurate and flexible evaluation of the application- and system-level implications of VM after integrating Virtuoso into a desired architectural simulator. We integrate Virtuoso into five diverse architectural simulators, each specializing in different aspects of system design, and heavily enrich it with multiple state-of-the-art VM schemes. Our validation shows that Virtuoso ported on top of Sniper, a state-of-the-art microarchitectural simulator, models the memory management unit of a real high-end server-grade CPU and the page fault latency of a real Linux kernel with high accuracy. Consequently, Virtuoso models the IPC performance of a real high-end server-grade CPU with 21% higher accuracy than the baseline version of Sniper. The source code of Virtuoso is freely available at https://github.com/CMU-SAFARI/Virtuoso.
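
To make the key idea concrete, the following is a minimal sketch of how a simulator might hand a page fault to an imitation-style kernel and get back both a functional result (the new mapping) and an overhead estimate (instructions charged to the handler). It is not Virtuoso's or MimicOS's actual interface: the class names, the 4 KiB page size, the bump-pointer frame allocator, and the instruction costs are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # bytes; assumed 4 KiB base pages


@dataclass
class FaultResult:
    """What the toy kernel hands back to the simulator."""
    frame: int           # functional outcome: frame now backing the page
    handler_instrs: int  # overhead outcome: instructions charged to the handler


@dataclass
class ToyMimicKernel:
    """Hypothetical imitation-style kernel (not the real MimicOS API)."""
    num_frames: int
    page_table: dict = field(default_factory=dict)  # virtual page -> frame
    next_free: int = 0

    def handle_page_fault(self, vaddr: int) -> FaultResult:
        vpn = vaddr // PAGE_SIZE
        if vpn in self.page_table:
            # Already mapped: assumed small fixed cost, nothing to allocate.
            return FaultResult(self.page_table[vpn], handler_instrs=50)
        if self.next_free >= self.num_frames:
            raise MemoryError("out of frames (this sketch has no swapping)")
        frame = self.next_free  # bump-pointer allocation, for simplicity
        self.next_free += 1
        self.page_table[vpn] = frame
        # Assumed cost: fixed bookkeeping plus zeroing the page 8 bytes at a time.
        return FaultResult(frame, handler_instrs=200 + PAGE_SIZE // 8)


class ToySimulator:
    """Hypothetical simulator front end that translates addresses."""

    def __init__(self, kernel: ToyMimicKernel):
        self.kernel = kernel
        self.faults = 0
        self.os_instrs = 0  # OS-induced overhead accumulated so far

    def access(self, vaddr: int) -> int:
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        frame = self.kernel.page_table.get(vpn)
        if frame is None:
            # Functional and overhead information arrive together here;
            # the simulator records them separately.
            result = self.kernel.handle_page_fault(vaddr)
            self.faults += 1
            self.os_instrs += result.handler_instrs
            frame = result.frame
        return frame * PAGE_SIZE + offset
```

In this sketch, the first access to a page triggers a fault and charges the handler cost, while later accesses to the same page translate without involving the kernel. A different allocation policy could be tried by replacing only `handle_page_fault`, which is the kind of separation the abstract describes.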

Paper Structure (33 sections, 21 figures, 5 tables)

Figures (21)

  • Figure 1: Fraction of total execution time spent in address translation and physical memory allocation in long-running and short-running workloads executed on a real high-end server system [kratos20].
  • Figure 2: Minor page fault latency distribution across two different physical memory allocation policies (i.e., THP [corbet2011, corbet2017] enabled and disabled) measured in a real system [kratos20].
  • Figure 3: Average PTW latency across 53 different applications that exhibit varying levels of memory intensity, measured in a real high-end server system [kratos20].
  • Figure 4: Overview of Virtuoso's Architecture.
  • Figure 5: Example page fault handling workflow of Virtuoso coupled with an architectural simulator.
  • ...and 16 more figures