
QOS: A Quantum Operating System

Emmanouil Giortamis, Francisco Romão, Nathaniel Tornow, Pramod Bhatotia

TL;DR

Quantum hardware remains noisy, small, and heterogeneous, creating a demanding resource-management problem for running quantum workloads at scale. QOS addresses this by introducing a cross-layer cloud operating system built around the Qernel abstraction, with four synergistic components: an error mitigator, an estimator, a multi-programmer, and a fidelity-aware scheduler. The framework demonstrates substantial gains on real IBM devices, including up to $456.5\times$ fidelity improvement, up to $9.6\times$ higher utilization, and up to $5\times$ shorter waiting times with minimal fidelity sacrifice. This holistic, modular approach enables scalable, resource-efficient quantum computation today and provides a foundation for future fault-tolerant extensions.

Abstract

Quantum computers face challenges due to hardware constraints, noise errors, and heterogeneity, and involve fundamental design tradeoffs between key performance metrics such as \textit{quantum fidelity} and system utilization. These factors substantially complicate managing quantum resources to scale the size and number of quantum algorithms that can be executed reliably in a given time. We introduce QOS, a cloud operating system for managing quantum resources while mitigating their inherent limitations and balancing the design tradeoffs of quantum computing. QOS exposes a hardware-agnostic API for transparent quantum job execution, mitigates hardware errors, and systematically multi-programs and schedules jobs across space and time to achieve high quantum fidelity in a resource-efficient manner. To achieve this, it leverages two key insights: First, to maximize utilization and minimize fidelity loss, some jobs are more compatible than others for multi-programming on the same quantum computer. Second, sacrificing minimal fidelity can significantly reduce job waiting times. We evaluate QOS on real quantum devices hosted by IBM, using 7000 real quantum runs of more than 70,000 benchmark instances. We show that QOS achieves 2.6--456.5$\times$ higher fidelity, increases resource utilization by up to 9.6$\times$, and reduces waiting times by up to 5$\times$ while sacrificing only 1--3\% fidelity, on average, compared to the baselines.
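The second insight (trading a small amount of fidelity for a shorter wait) can be illustrated with a toy selection rule. This is a minimal sketch and not the QOS scheduler: the `QPU` record, the `pick_qpu` helper, the example numbers, and the 3% tolerance (loosely inspired by the 1--3\% figure above) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QPU:
    """Hypothetical per-job estimates for one quantum device."""
    name: str
    est_fidelity: float  # estimated fidelity of the job on this QPU, in [0, 1]
    est_wait_s: float    # estimated queue waiting time in seconds


def pick_qpu(qpus, max_fidelity_loss=0.03):
    """Pick the shortest-wait QPU among those close to the best fidelity.

    A QPU is a candidate if its estimated fidelity is within
    `max_fidelity_loss` (absolute) of the best estimate.
    """
    best = max(q.est_fidelity for q in qpus)
    candidates = [q for q in qpus if best - q.est_fidelity <= max_fidelity_loss]
    return min(candidates, key=lambda q: q.est_wait_s)


# Made-up estimates: qpu_a is the most accurate but has the longest queue.
qpus = [
    QPU("qpu_a", 0.90, 3600.0),
    QPU("qpu_b", 0.88, 600.0),
    QPU("qpu_c", 0.70, 60.0),
]
choice = pick_qpu(qpus)
print(choice.name, choice.est_fidelity, choice.est_wait_s)
```

With these made-up estimates the rule gives up two points of fidelity to cut the wait from an hour to ten minutes, while rejecting the much noisier device despite its short queue. Setting `max_fidelity_loss=0.0` falls back to always choosing the highest-fidelity device.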

Paper Structure (28 sections, 12 figures, 1 table)

Figures (12)

  • Figure 1: Foundational example (§ \ref{sec:background:101}). (a) Input graph to max-cut. (b) A quantum circuit encoding the max-cut formulation for the graph. (c) The execution result is a probability distribution of bitstrings. (d) The result is interpreted as a max-cut between vertices $\{a, d\}$ and $\{b,c,e\}$.
  • Figure 2: Technical foundations (§ \ref{sec:background:foundations}). (a) The quantum circuit of Figure \ref{fig:qaoa-background}. (b) The physical layout of an IBM Falcon QPU. (c) The transpiled circuit with the QPU's noise sources.
  • Figure 3: (a) Challenge #1, Fidelity (§ \ref{sec:challenges:scalability}). Impact of the number of qubits (circuit size) on fidelity. There is an average $98.9\%$ reduction in fidelity from 4 to 24 qubits. (b) Challenge #2, Spatial heterogeneity (§ \ref{sec:challenges:spatial_and_temporal}). Fidelity of a 12-qubit GHZ circuit on different IBM QPUs. There is a $38\%$ fidelity difference from best to worst QPU.
  • Figure 4: (a) Challenge #2, Temporal variance (§ \ref{sec:challenges:spatial_and_temporal}). Fidelity of a 6-qubit GHZ circuit on IBM Perth, across 120 calibration days. There are 20 pairs of days with more than $5\%$ difference in fidelity. (b) Challenge #3, Utilization (§ \ref{sec:challenges:utilization}). Maximum utilization achieved on a 27-qubit QPU for nine benchmarks while maintaining at least 0.75 fidelity. The average utilization is $26.3\%$, and the maximum is $29.6\%$. (c) Challenge #4, QPU load (§ \ref{sec:challenges:qpu_load}). Number of pending jobs on different IBM QPUs. The groups separated by vertical red lines indicate QPUs of the same size. There is up to a $57\times$ difference in the number of jobs between QPUs of the same size.
  • Figure 5: QOS overview (§ \ref{sec:overview:overview}). QOS consists of four main components: the error mitigator, estimator, multi-programmer, and scheduler.
  • ...and 7 more figures
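
The utilization figures in the Figure 4(b) caption can be sanity-checked with simple arithmetic. The sketch below assumes utilization means the fraction of a QPU's qubits occupied by a circuit; that definition and the `utilization_pct` helper are assumptions for illustration and are not taken from the paper.

```python
def utilization_pct(qubits_used: int, qpu_qubits: int) -> float:
    """Percentage of a QPU's qubits occupied by a circuit (assumed definition)."""
    if not 0 <= qubits_used <= qpu_qubits:
        raise ValueError("qubits_used must be between 0 and qpu_qubits")
    return 100.0 * qubits_used / qpu_qubits


# On a 27-qubit QPU, compare a few circuit sizes under the assumed definition.
for n in (7, 8, 9):
    print(n, round(utilization_pct(n, 27), 1))
```

Under this assumed definition, an 8-qubit circuit on a 27-qubit device gives about $29.6\%$, which is consistent with the maximum quoted in the caption.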