Platform Architecture for Tight Coupling of High-Performance Computing with Quantum Processors

Shane A. Caldwell; Moein Khazraee; Elena Agostini; Tom Lassiter; Corey Simpson; Omri Kahalon; Mrudula Kanuri; Jin-Sung Kim; Sam Stanwyck; Muyuan Li; Jan Olle; Christopher Chamberland; Ben Howe; Bruno Schmitt; Justin G. Lietz; Alex McCaskey; Jun Ye; Ang Li; Alicia B. Magann; Corey I. Ostrove; Kenneth Rudinger; Robin Blume-Kohout; Kevin Young; Nathan E. Miller; Yilun Xu; Gang Huang; Irfan Siddiqi; John Lange; Christopher Zimmer; Travis Humble

TL;DR

NVQLink is an architecture that tightly couples HPC resources to QPU control systems to support online workloads such as quantum error correction (QEC). It achieves round-trip latencies below 4 $\mu$s over a RoCE network and enables real-time device callbacks via CUDA-Q. It introduces a programming model built on device_call and device_ptr, a trait-based runtime, and an open compilation and execution flow that adapts across high- and low-latency regimes, including VPPU and PQPU simulation tools for offline development. The approach addresses QEC throughput and reaction-time requirements, highlights lattice-surgery-based scalable fault-tolerant execution, and discusses calibration and QCVV workloads that benefit from tight CPU/GPU co-processing. Together with the development tools and open specification, NVQLink aims to accelerate the path to fault-tolerant quantum computing by providing a scalable, vendor-agnostic platform for real-time quantum-classical co-processing.

Abstract

We propose an architecture, called NVQLink, for connecting high-performance computing (HPC) resources to the control system of a quantum processing unit (QPU) to accelerate workloads necessary to the operation of the QPU. We aim to support every physical modality of QPU and every type of QPU system controller (QSC). The HPC resource is optimized for real-time (latency-bounded) processing on tasks with latency tolerances of tens of microseconds. The network connecting the HPC and QSC is implemented on commercially available Ethernet and can be adopted relatively easily by QPU and QSC builders. We report a maximum measured round-trip latency of 3.96 microseconds, with prospects of further optimization. We describe an extension to the CUDA-Q programming model and runtime architecture to support real-time callbacks and data marshaling between the HPC and QSC. By doing so, NVQLink extends heterogeneous, kernel-based programming to the QSC, allowing the programmer to address CPU, GPU, and FPGA subsystems in the QSC, all in the same C++ program, avoiding the use of a performance-limiting HTTP interface. We provide a pattern for QSC builders to integrate with this architecture by making use of multi-level intermediate representation dialects and progressive lowering to encapsulate QSC code.
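To put the reported latency in context, the following is a minimal arithmetic sketch of how much of a real-time task's latency tolerance remains for HPC-side processing once the network round trip is paid. Only the 3.96 microsecond figure comes from the abstract; the function names and the sample tolerances (10, 20, and 50 microseconds, chosen to represent "tens of microseconds") are hypothetical and are not part of NVQLink or CUDA-Q.

```python
# Illustrative latency-budget arithmetic. Only ROUND_TRIP_US is taken from
# the abstract; every other name and value here is an assumption.
ROUND_TRIP_US = 3.96  # reported maximum round-trip latency, in microseconds


def compute_budget_us(tolerance_us: float,
                      round_trip_us: float = ROUND_TRIP_US) -> float:
    """Return the time left for processing after the network round trip.

    A negative result means the round trip alone exceeds the tolerance.
    """
    if tolerance_us <= 0:
        raise ValueError("tolerance_us must be positive")
    return tolerance_us - round_trip_us


def network_fraction(tolerance_us: float,
                     round_trip_us: float = ROUND_TRIP_US) -> float:
    """Return the fraction of the tolerance consumed by the round trip."""
    if tolerance_us <= 0:
        raise ValueError("tolerance_us must be positive")
    return round_trip_us / tolerance_us


if __name__ == "__main__":
    # Hypothetical tolerances standing in for "tens of microseconds".
    for tolerance in (10.0, 20.0, 50.0):
        budget = compute_budget_us(tolerance)
        share = network_fraction(tolerance)
        print(f"tolerance={tolerance:5.1f} us  "
              f"compute budget={budget:6.2f} us  "
              f"network share={share:.1%}")
```

Under these assumed tolerances, the round trip consumes a minority of the budget, which is the sense in which the network leaves room for useful computation within the latency bound.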
