HALO: A Fine-Grained Resource Sharing Quantum Operating System

John Zhuoyang Ye, Jiyuan Wang, Yifan Qiao, Jens Palsberg

TL;DR

HALO addresses the scarcity and queuing of quantum hardware by introducing fine-grained resource sharing and shot-aware scheduling to enable concurrent multi-process execution on a single device. It virtualizes data qubits, helper qubits, and shots, then uses a hardware-aware data-qubit space manager and a round-robin instruction scheduler to co-locate processes while enforcing isolation via resets of shared ancillas. The results on IBM Torino show up to 2.44x improvements in space utilization and 4.44x in throughput, with fidelity degradation kept within ~33% under aggressive sharing, illustrating a practical space-time trade-off for near-term devices. HALO demonstrates that carefully designed qubit sharing and temporal multiplexing can substantially increase quantum hardware utilization and reduce queue times for helper-qubit–intensive workloads, informing the design of future quantum cloud platforms and fault-tolerant architectures.

Abstract

As quantum computing enters the cloud era, thousands of users must share access to a small number of quantum processors. Users wait minutes to days for jobs that take only a few seconds to execute. Current quantum cloud platforms employ a fair-share scheduler, as there is no way to multiplex a quantum computer among multiple programs at the same time, leaving many qubits idle and significantly under-utilizing the hardware. This imbalance between high user demand and scarce quantum resources has become a key barrier to scalable and cost-effective quantum computing. We present HALO, the first quantum operating system design that supports fine-grained resource sharing. HALO introduces two complementary mechanisms. The first is a hardware-aware qubit-sharing algorithm that places shared helper qubits on regions of the quantum computer that minimize routing overhead and avoid cross-talk noise between different users' processes. The second is a shot-adaptive scheduler that allocates execution windows according to each job's sampling requirements, improving throughput and reducing latency. Together, these mechanisms transform the way quantum hardware is scheduled and achieve more fine-grained parallelism. We evaluate HALO on the IBM Torino quantum computer using helper-qubit-intensive benchmarks. Compared to state-of-the-art systems such as HyperQ, HALO improves overall hardware utilization by up to 2.44x, increases throughput by 4.44x, and keeps fidelity loss within 33%, demonstrating the practicality of resource sharing in quantum computing.

Paper Structure (39 sections, 4 equations, 8 figures, 6 tables)

Figures (8)

  • Figure 1: Sharing helper qubits can in principle increase system throughput for applications that need many helper qubits.
  • Figure 2: The workflow diagram of the HALO scheduler in our resource-sharing quantum operating system. It takes users' processes as input and sends scheduled quantum gate instructions to the quantum hardware.
  • Figure 3: Example of HALO space management and instruction scheduling. HALO schedules a batch of two processes with $12$ qubits in total on a $10$-qubit quantum hardware (impossible without sharing helper qubits). Both $P_1$ and $P_2$ have $3$ data qubits and $3$ helper qubits. We show three suboptimal data-qubit layouts, and also the optimal data-qubit layout. Right: The full scheduled instructions of the two-process batch in round-robin order after HALO fixes the data-qubit layout mapping.
  • Figure 4: Left: The greedy cluster initial mapping of HALO for a batch of 10 processes on the IBM Torino quantum processor. Right: Final mapping decision of HALO after 30 iterations of simulated annealing.
  • Figure 5: The average running time of HALO's space management for different numbers of processes in a batch (§\ref{subsec:SpaceManager}). The scheduling ends within $60$ seconds in almost all cases, far less than the hours of waiting time in current quantum cloud services.
  • ...and 3 more figures
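
The qubit counting in the Figure 3 caption, and the round-robin scheduling with resets described in the TL;DR, can be illustrated with a minimal Python sketch. This is not HALO's implementation. The function names, the single shared helper named "h0", the instruction tuples, and the rule that a reset is emitted whenever the helper passes from one process to another are all assumptions made for illustration. The capacity check further assumes that the qubits left over after placing all data qubits form one shared helper pool, and that a batch fits when that pool covers the largest helper demand of any single process.

```python
from collections import deque

def fits_without_sharing(procs, device_qubits):
    """procs is a list of (data_qubits, helper_qubits) pairs, one per process."""
    return sum(d + h for d, h in procs) <= device_qubits

def fits_with_sharing(procs, device_qubits):
    """Assumption: data qubits are private; the remaining qubits are a shared helper pool."""
    pool = device_qubits - sum(d for d, _ in procs)
    return pool >= max(h for _, h in procs)

def round_robin(programs, helper="h0"):
    """Interleave instructions from several processes, one instruction per turn.

    programs maps a process name to a list of (op, uses_helper) pairs.
    Returns a list of (owner, op, qubit) entries. A reset entry owned by
    "scheduler" is emitted before the shared helper changes hands.
    """
    queues = deque((name, deque(ops)) for name, ops in programs.items() if ops)
    schedule = []
    last_owner = None  # process that most recently used the shared helper
    while queues:
        name, ops = queues.popleft()
        op, uses_helper = ops.popleft()
        if uses_helper:
            if last_owner is not None and last_owner != name:
                # Isolation step (assumed policy): clear the previous owner's state.
                schedule.append(("scheduler", "reset", helper))
            last_owner = name
            schedule.append((name, op, helper))
        else:
            schedule.append((name, op, None))
        if ops:
            queues.append((name, ops))  # process goes to the back of the line
    return schedule

# The Figure 3 setting: two processes, each with 3 data and 3 helper qubits, on 10 qubits.
batch = [(3, 3), (3, 3)]
print(fits_without_sharing(batch, 10), fits_with_sharing(batch, 10))

demo = round_robin({
    "P1": [("h", False), ("cx_helper", True)],
    "P2": [("x", False), ("cx_helper", True)],
})
for entry in demo:
    print(entry)
```

In this toy model a helper use is treated as atomic within one turn, so the only isolation decision is whether to reset before the next user. A real scheduler would also have to account for hardware connectivity and gate timing, which the sketch ignores.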