Observability Architecture for Quantum-Centric Supercomputing Workflows

Naoki Kanazawa, Yuto Morohoshi, Hitomi Takahashi, Yukio Kawashima, Hiroshi Horii, Kengo Nakajima

TL;DR

This work tackles the challenge of observing quantum-centric supercomputing (QCSC) workflows, which combine probabilistic quantum kernels with large-scale classical orchestration and remote hardware. It proposes an application-level observability architecture organized around a workflow metrics pyramid, decoupled telemetry processing, and persistent storage, so that runs can be analyzed retrospectively without affecting primary execution. The authors implement this architecture on the Miyabi supercomputer and IBM Quantum systems, using Prefect for workflow orchestration and Apache Superset for dashboards. They demonstrate it on a closed-loop sample-based quantum diagonalization (SQD) workflow that employs differential evolution (DE) to study a [4Fe-4S] chemistry Hamiltonian. The results show how domain-level telemetry reveals solver dynamics and resource bottlenecks, supporting infrastructure-aware algorithm design and systematic experimentation in QCSC environments.

Abstract

Quantum-centric supercomputing (QCSC) workflows often involve hybrid classical-quantum algorithms that are inherently probabilistic and executed on remote quantum hardware, making them difficult to interpret and limiting the ability to monitor runtime performance and behavior. The high cost of quantum circuit execution and large-scale high-performance computing (HPC) infrastructure further restricts the number of feasible trials, making comprehensive evaluation of execution results essential for iterative development. We propose an observability architecture tailored for QCSC workflows that decouples telemetry collection from workload execution, enabling persistent monitoring across system and algorithmic layers and retaining detailed execution data for reproducible and retrospective analysis, eliminating redundant runs. Applied to a representative workflow involving sample-based quantum diagonalization, our system reveals solver behavior across multiple iterations. This approach enhances transparency and reproducibility in QCSC environments, supporting infrastructure-aware algorithm design and systematic experimentation.

Paper Structure

This paper contains 9 sections, 5 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: (a) Workflow metrics pyramid showing categories of telemetry data and (b) schematic diagram of the QCSC observability architecture.
  • Figure 2: Component diagram of QCSC observability architecture implemented on the Miyabi supercomputer and mdx platform.
  • Figure 3: Timing diagram of the differential evolution algorithm workflow. The shaded box on the cli machine is implemented as a single task and executed in parallel for all populations using the Ray executor integration prefect-ray. Important SQD variables in Table 1 are stored on the serv machine for post-hoc ETL pipeline execution.
  • Figure 4: Observability dashboard of domain-level (L4) metrics in the closed-loop SQD workflow. All panels except (f) show how various metrics evolve with increasing DE iterations for two independent executions under an identical setup. Panel (f) shows the average occupancy of each spatial orbital at different iterations of the first execution. The legend indicates the average occupancy of the 27th orbital, which corresponds to the first unoccupied orbital in the RHF reference. See the main text for the details of each metric.
  • Figure 5: Observability dashboard of performance metrics (L3 and L2) from the first execution of the closed-loop SQD workflow. The top and middle boxes show statistics of the L2 telemetry for 80 jobs on the QPU and HPC, respectively. The histograms represent performance distributions for these jobs. HPC job metrics are collected with the qstat -f command. Refer to the PBS reference guide [altair_pbs_reference_2021] for the definitions. (a) QPU usage relative to the allocated time limit of 60,000 seconds. (b) Queueing time for QPU jobs. (c) QPU wall-clock time, including primitive payload compilation, execution, and post-processing. (d) HPC usage relative to the allocated token limit of 8,640 tokens. (e) Queueing time for HPC jobs, computed as stime - etime. (f) HPC wall-clock time, indicated by walltime. (g) Total virtual memory allocated across all concurrent processes within a job, indicated by resources_used.vmem. (h) Maximum CPU utilization rate, indicated by resources_used.cpupercent. (i) Execution time for dominant Prefect tasks.
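
The HPC job metrics named in the Figure 5 caption can be derived from qstat -f text with a few lines of parsing. The following is a minimal sketch, not the authors' ETL pipeline. The sample output, the job id, and the helper names (parse_qstat_full, hms_to_seconds, job_metrics) are hypothetical. It assumes the usual "key = value" attribute layout, timestamps such as "Tue Mar  4 10:00:00 2025", wall time reported under resources_used.walltime, and memory reported in kb. Attribute values wrapped across several lines are not handled.

```python
from datetime import datetime

# Hypothetical excerpt of `qstat -f` output; all values are invented for illustration.
SAMPLE = """\
Job Id: 12345.example
    resources_used.cpupercent = 395
    resources_used.vmem = 2048000kb
    resources_used.walltime = 00:12:30
    etime = Tue Mar  4 10:00:00 2025
    stime = Tue Mar  4 10:01:30 2025
"""

# Assumed timestamp layout of the etime and stime attributes.
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_qstat_full(text):
    """Collect 'key = value' attribute lines into a dict (wrapped values are ignored)."""
    attrs = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            attrs[key.strip()] = value.strip()
    return attrs


def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def job_metrics(attrs):
    """Derive the per-job quantities listed in panels (e) to (h) of Figure 5."""
    etime = datetime.strptime(attrs["etime"], TIME_FORMAT)  # job became eligible to run
    stime = datetime.strptime(attrs["stime"], TIME_FORMAT)  # job started running
    return {
        "queue_seconds": (stime - etime).total_seconds(),  # panel (e): stime - etime
        "walltime_seconds": hms_to_seconds(attrs["resources_used.walltime"]),
        "vmem_kb": int(attrs["resources_used.vmem"].removesuffix("kb")),
        "cpupercent": int(attrs["resources_used.cpupercent"]),
    }


metrics = job_metrics(parse_qstat_full(SAMPLE))
print(metrics)
```

In a setup like the one described, a dictionary of this kind would be one row per job in persistent storage, from which the dashboard histograms could be built. The sketch requires Python 3.9 or later for str.removesuffix.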