QOS: A Quantum Operating System
Emmanouil Giortamis, Francisco Romão, Nathaniel Tornow, Pramod Bhatotia
TL;DR
Quantum hardware remains noisy, small, and heterogeneous, creating a demanding resource-management problem for running quantum workloads at scale. QOS addresses this by introducing a cross-layer cloud operating system built around the Qernel abstraction, with four cooperating components: an error mitigator, an estimator, a multi-programmer, and a fidelity-aware scheduler. Evaluated on real IBM devices, the framework achieves up to $456.5\times$ higher fidelity, up to $9.6\times$ higher utilization, and up to $5\times$ shorter waiting times while sacrificing only 1--3\% fidelity on average. This modular approach enables resource-efficient quantum computation on current hardware and provides a foundation for future fault-tolerant extensions.
Abstract
Quantum computers face hardware constraints, noise errors, and heterogeneity, as well as fundamental design tradeoffs between key performance metrics such as \textit{quantum fidelity} and system utilization. These limitations substantially complicate managing quantum resources to scale the size and number of quantum algorithms that can be executed reliably in a given time. We introduce QOS, a cloud operating system for managing quantum resources while mitigating their inherent limitations and balancing the design tradeoffs of quantum computing. QOS exposes a hardware-agnostic API for transparent quantum job execution, mitigates hardware errors, and systematically multi-programs and schedules jobs across space and time to achieve high quantum fidelity in a resource-efficient manner. To achieve this, it leverages two key insights. First, to maximize utilization and minimize fidelity loss, some jobs are more compatible than others for multi-programming on the same quantum computer. Second, sacrificing minimal fidelity can significantly reduce job waiting times. We evaluate QOS on real quantum devices hosted by IBM, using 7000 real quantum runs of more than 70,000 benchmark instances. We show that QOS achieves 2.6--456.5$\times$ higher fidelity, increases resource utilization by up to 9.6$\times$, and reduces waiting times by up to 5$\times$ while sacrificing only 1--3\% fidelity, on average, compared to the baselines.
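
The second insight, trading a small amount of fidelity for a shorter wait, can be illustrated with a minimal sketch. This is not the QOS scheduler: the `Candidate` type, the `pick_device` function, the `max_fidelity_loss` parameter, and all numbers are hypothetical and chosen only for illustration. The sketch assumes that a per-device fidelity estimate and a queue waiting-time estimate are already available, and it selects the device with the shortest wait among those whose estimated fidelity is within a tolerance of the best one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible placement of a job (all fields are illustrative)."""
    device: str          # hypothetical device name
    est_fidelity: float  # estimated fidelity in [0, 1]
    est_wait_s: float    # estimated queue waiting time in seconds


def pick_device(candidates, max_fidelity_loss=0.03):
    """Return the shortest-wait candidate within a fidelity tolerance.

    Assumption: a job may give up at most `max_fidelity_loss` (absolute)
    relative to the best estimated fidelity among the candidates.
    """
    if not candidates:
        raise ValueError("no candidate devices")
    best = max(c.est_fidelity for c in candidates)
    # Keep only devices whose fidelity is close enough to the best one.
    acceptable = [c for c in candidates
                  if best - c.est_fidelity <= max_fidelity_loss]
    # Among those, prefer the shortest wait; break ties by higher fidelity.
    return min(acceptable, key=lambda c: (c.est_wait_s, -c.est_fidelity))


if __name__ == "__main__":
    options = [
        Candidate("device_a", 0.90, 3600.0),
        Candidate("device_b", 0.88, 600.0),
        Candidate("device_c", 0.70, 60.0),
    ]
    print(pick_device(options).device)
```

With the illustrative numbers above and the default tolerance, the sketch accepts a 0.02 lower estimated fidelity on `device_b` in exchange for a six times shorter wait, while `device_c` is rejected because its fidelity loss exceeds the tolerance. Setting `max_fidelity_loss=0.0` reduces the policy to picking the highest-fidelity device regardless of queue length.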
