HALO: A Fine-Grained Resource Sharing Quantum Operating System
John Zhuoyang Ye, Jiyuan Wang, Yifan Qiao, Jens Palsberg
TL;DR
HALO addresses the scarcity and queuing of quantum hardware by introducing fine-grained resource sharing and shot-aware scheduling to enable concurrent multi-process execution on a single device. It virtualizes data qubits, helper qubits, and shots, then uses a hardware-aware data-qubit space manager and a round-robin instruction scheduler to co-locate processes while enforcing isolation via resets of shared ancillas. Results on IBM Torino show improvements of up to 2.44x in space utilization and 4.44x in throughput, with fidelity degradation kept within ~33% under aggressive sharing, illustrating a practical space-time trade-off for near-term devices. HALO demonstrates that carefully designed qubit sharing and temporal multiplexing can substantially increase quantum hardware utilization and reduce queue times for helper-qubit-intensive workloads, informing the design of future quantum cloud platforms and fault-tolerant architectures.
Abstract
As quantum computing enters the cloud era, thousands of users must share access to a small number of quantum processors. Users wait minutes to days for their jobs to start, even though each job takes only a few seconds to execute. Current quantum cloud platforms employ a fair-share scheduler because there is no way to multiplex a quantum computer among multiple programs at the same time, which leaves many qubits idle and significantly under-utilizes the hardware. This imbalance between high user demand and scarce quantum resources has become a key barrier to scalable and cost-effective quantum computing. We present HALO, the first quantum operating system design that supports fine-grained resource sharing. HALO introduces two complementary mechanisms. The first is a hardware-aware qubit-sharing algorithm that places shared helper qubits on regions of the quantum computer that minimize routing overhead and avoid cross-talk noise between different users' processes. The second is a shot-adaptive scheduler that allocates execution windows according to each job's sampling requirements, improving throughput and reducing latency. Together, these mechanisms transform the way quantum hardware is scheduled and achieve more fine-grained parallelism. We evaluate HALO on the IBM Torino quantum computer using helper-qubit-intensive benchmarks. Compared to state-of-the-art systems such as HyperQ, HALO improves overall hardware utilization by up to 2.44x, increases throughput by 4.44x, and keeps fidelity loss within 33%, demonstrating the practicality of resource sharing in quantum computing.
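
To make the idea of allocating execution windows by sampling requirement concrete, the following is a minimal sketch of round-robin time slicing over shots. It is not HALO's scheduler: the `Job` class, the `round_robin_by_shots` function, and the fixed `window` parameter are hypothetical names introduced for illustration, and the sketch models only the order in which shot windows are granted, not qubit placement, ancilla resets, or concurrent execution on the device.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str             # hypothetical job identifier
    shots_remaining: int  # samples the job still needs


def round_robin_by_shots(jobs, window):
    """Return the execution order as (job name, shots run) pairs.

    Each turn grants a job at most `window` shots. A job that still
    needs shots rejoins the back of the queue; a finished job leaves.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    # Copy the jobs so the caller's objects are not mutated.
    queue = deque(
        Job(j.name, j.shots_remaining) for j in jobs if j.shots_remaining > 0
    )
    timeline = []
    while queue:
        job = queue.popleft()
        granted = min(window, job.shots_remaining)
        timeline.append((job.name, granted))
        job.shots_remaining -= granted
        if job.shots_remaining > 0:
            queue.append(job)
    return timeline


if __name__ == "__main__":
    jobs = [Job("A", 250), Job("B", 100), Job("C", 300)]
    for name, shots in round_robin_by_shots(jobs, window=100):
        print(name, shots)
```

In this toy model a job with a small shot budget, such as `B`, completes after a single window instead of waiting behind longer jobs, which is the latency benefit that a shot-aware policy aims for.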
