Low-Latency FPGA Control System for Real-Time Neural Network Processing in CCD-Based Trapped-Ion Qubit Measurement

Binglei Lou, Gautham Duddi Krishnaswaroop, Filip Wojcicki, Ruilin Wu, Richard Rademacher, Zhiqiang Que, Wayne Luk, Philip H. W. Leong

TL;DR

This work tackles real-time qubit state detection in trapped-ion quantum processors by benchmarking DNN-based detectors on FPGA and GPU. It introduces LUT-based MLP and Vision Transformer accelerators implemented on FPGA, achieving nanosecond to microsecond inference and substantial fidelity gains, especially for multi-qubit states, while revealing Cameralink readout as the main latency bottleneck. The study demonstrates that FPGA-based detection can be over 100x faster than GPU baselines for single-shot measurements and provides actionable insights into hardware bottlenecks and optimization paths. Collectively, the results offer a practical roadmap for ultra-low-latency qubit readout and guide future camera-interface and FPGA design choices for scalable quantum measurement systems.

Abstract

Accurate and low-latency qubit state measurement is critical for trapped-ion quantum computing. While deep neural networks (DNNs) have been integrated to enhance detection fidelity, their latency performance on specific hardware platforms remains underexplored. This work benchmarks the latency of DNN-based qubit detection on field-programmable gate arrays (FPGAs) and graphics processing units (GPUs). The FPGA solution directly interfaces an electron-multiplying charge-coupled device (EMCCD) with the subsequent data processing logic, eliminating buffering and interface overheads. As a baseline, the GPU-based system employs a high-speed PCIe image grabber for image input and I/O card for state output. We deploy Multilayer Perceptron (MLP) and Vision Transformer (ViT) models on hardware to evaluate measurement performance. Compared to conventional thresholding, DNNs reduce the mean measurement fidelity (MMF) error by factors of 1.8-2.5x (one-qubit case) and 4.2-7.6x (three-qubit case). FPGA-based MLP and ViT achieve nanosecond- and microsecond-scale inference latencies, while the complete single-shot measurement process achieves over 100x speedup compared to the GPU implementation. Additionally, clock-cycle-level signal analysis reveals inefficiencies in EMCCD data transmission via Cameralink, suggesting that optimizing this interface could further leverage the advantages of ultra-low-latency DNN inference, guiding the development of next-generation qubit detection systems.
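The "factors of 1.8-2.5x" quoted above describe reductions in MMF error, not in MMF itself. A minimal sketch of that arithmetic follows. It assumes MMF is the plain average of per-state detection fidelities and that MMF error is 1 - MMF; neither definition is spelled out in the abstract. The function names and all fidelity values are hypothetical and are not results from the paper.

```python
def mmf(per_state_fidelities):
    """Mean measurement fidelity, assumed here to be the average of per-state fidelities."""
    if not per_state_fidelities:
        raise ValueError("need at least one per-state fidelity")
    return sum(per_state_fidelities) / len(per_state_fidelities)


def error_reduction_factor(mmf_baseline, mmf_dnn):
    """Ratio of the baseline MMF error (1 - MMF) to the DNN MMF error."""
    err_baseline = 1.0 - mmf_baseline
    err_dnn = 1.0 - mmf_dnn
    if err_dnn <= 0.0:
        raise ValueError("DNN MMF error must be positive to form a ratio")
    return err_baseline / err_dnn


# Hypothetical one-qubit fidelities (dark state, bright state); illustration only.
threshold_mmf = mmf([0.990, 0.980])
dnn_mmf = mmf([0.995, 0.991])
factor = error_reduction_factor(threshold_mmf, dnn_mmf)
print(f"MMF error reduced by about {factor:.2f}x")
```

Working with the error rather than the fidelity makes small improvements near 100% visible: a change from 98.5% to 99.3% MMF is under one percentage point, yet it roughly halves the error.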

Paper Structure

This paper contains 15 sections, 3 equations, 15 figures, 4 tables.

Figures (15)

  • Figure 1: Block diagram of ML-aided qubit detection in a trapped-ion system. Photons scattered by the ions are collected and magnified by optical elements before being directed towards an EMCCD camera. The resulting images are transmitted to the acceleration hardware, such as an FPGA or GPU, via the Cameralink protocol. The bar chart on the right illustrates the speedup achieved by the DNN-based (ViT) measurement solution on a 3-qubit test.
  • Figure 2: Framework of our qubit detection system. The camera is triggered by an ARTIQ Kasli system (Ref. artiqKasli), and its images can be processed via either Kasli-based thresholding or FPGA-based DNN processing through the Cameralink protocol.
  • Figure 3: Flow chart of the proposed FPGA-based qubit detection system. Steps marked with $\ast$ have probe signals available for latency measurements. The detailed timing diagram of FVAL, LVAL, tx_done, DNN_valid, and DNN_data is illustrated in Fig. 4.
  • Figure 4: Timing diagram focusing on the relationships between FVAL, tx_done, and the DNN output; other signal details are omitted for simplicity (see Ref. cameralink for a detailed description of the Cameralink protocol).
  • Figure 5: LUT-DNN Method.
  • ...and 10 more figures