Table of Contents
Fetching ...

A fast and frugal Gaussian Boson Sampling emulator

Tom Dodd, Javier Martínez-Cifuentes, Oliver Thomson Brown, Nicolás Quesada, Raúl García-Patrón

TL;DR

This work presents a cumulant-expansion based classical emulator for Gaussian Boson Sampling that targets noisy, practical devices. By pre-computing cumulants up to a fixed order $K$ from the ground-truth Gaussian covariance and then sampling via a chain-rule with approximated marginals, the method achieves polynomial scaling in the number of modes $M$, with demonstrated efficiency on 100+ mode benchmarks. The emulator shows strong agreement with, and in some cases surpasses, ground-truth statistics from Jiuzhang experiments across multiple configurations, while using substantially less memory and a single CPU/GPU cluster. The approach generalizes to other binary-output distributions, offering a potential classical surrogate for quantum-classical hybrid algorithms and informing hardware design to preserve high-order cumulants under realistic imperfections. It also opens avenues for extension to photo-counting and single-photon scenarios and supports scalable testing as Gaussian-based quantum simulations grow in size. $K$-bounded cumulants and the corresponding marginals provide a flexible, implementable framework for simulating large-scale sampling tasks in photonic quantum devices.

Abstract

If classical algorithms have been successful in reproducing the estimation of expectation values of observables of some quantum circuits using off-the-shelf computing resources, matching the performance of the most advanced quantum devices on sampling problems usually requires extreme cost in terms of memory and computing operations, making them accessible to only a handful of supercomputers around the world. In this work, we demonstrate for the first time a classical simulation outperforming Gaussian boson sampling experiments of one hundred modes on established benchmark tests using a single CPU or GPU. Being embarrassingly parallelizable, a small number of CPUs or GPUs allows us to match previous sampling rates that required more than one hundred GPUs. We believe algorithmic and implementation improvements will generalize our tools to photo-counting, single-photon inputs, and pseudo-photon-number-resolving scenarios beyond one thousand modes. Finally, most of the innovations in our tools remain valid for generic probability distributions over binary variables, rendering it potentially applicable to the simulation of qubit-based sampling problems and creating classical surrogates for classical-quantum algorithms.

A fast and frugal Gaussian Boson Sampling emulator

TL;DR

This work presents a cumulant-expansion based classical emulator for Gaussian Boson Sampling that targets noisy, practical devices. By pre-computing cumulants up to a fixed order from the ground-truth Gaussian covariance and then sampling via a chain-rule with approximated marginals, the method achieves polynomial scaling in the number of modes , with demonstrated efficiency on 100+ mode benchmarks. The emulator shows strong agreement with, and in some cases surpasses, ground-truth statistics from Jiuzhang experiments across multiple configurations, while using substantially less memory and a single CPU/GPU cluster. The approach generalizes to other binary-output distributions, offering a potential classical surrogate for quantum-classical hybrid algorithms and informing hardware design to preserve high-order cumulants under realistic imperfections. It also opens avenues for extension to photo-counting and single-photon scenarios and supports scalable testing as Gaussian-based quantum simulations grow in size. -bounded cumulants and the corresponding marginals provide a flexible, implementable framework for simulating large-scale sampling tasks in photonic quantum devices.

Abstract

If classical algorithms have been successful in reproducing the estimation of expectation values of observables of some quantum circuits using off-the-shelf computing resources, matching the performance of the most advanced quantum devices on sampling problems usually requires extreme cost in terms of memory and computing operations, making them accessible to only a handful of supercomputers around the world. In this work, we demonstrate for the first time a classical simulation outperforming Gaussian boson sampling experiments of one hundred modes on established benchmark tests using a single CPU or GPU. Being embarrassingly parallelizable, a small number of CPUs or GPUs allows us to match previous sampling rates that required more than one hundred GPUs. We believe algorithmic and implementation improvements will generalize our tools to photo-counting, single-photon inputs, and pseudo-photon-number-resolving scenarios beyond one thousand modes. Finally, most of the innovations in our tools remain valid for generic probability distributions over binary variables, rendering it potentially applicable to the simulation of qubit-based sampling problems and creating classical surrogates for classical-quantum algorithms.

Paper Structure

This paper contains 39 sections, 62 equations, 5 figures.

Figures (5)

  • Figure 1: (1) An instance of GBS is characterized by its ideal optical circuit, or equivalently the covariance matrix of its associated Gaussian state; (2) An experiment implements a version affected by losses and other imperfections; (3) One can extract the ground truth of the experiment from tomographic data or applying a good model of imperfections to the noiseless GBS instance; (4) 10 million samples are generated by quantum hardware in approximately 10 seconds; (5) The samples are benchmarked using cumulant statistical analysis and XEB; (6) The first stage of our algorithm computes a list of $\binom{144}{5}$ correlators that are later transformed into cumulants from the ground truth covariance matrix, as detailed in section \ref{['subsec:Training']}; (7) The second stage generates samples from the corresponding cumulants as detailed in section \ref{['subsec:Algorithm']}; (8) Using 8 GPUs, we can generate 10 million samples in 115 minutes, later validated with the same benchmarking tests.
  • Figure 2: Validation tests for the J2-P65-2 experiment. (a) Comparison between the estimated click cumulants (orders 2 to 5), from experimental (dark blue circles) and double-elision samples with $K=5$ (yellow circles), and those predicted by the ground truth of the experiment (light green line). The vertical axis corresponds to cumulants obtained from samples, either experimental or double elision, while the horizontal axis corresponds to the ground truth (i.e., theoretical) cumulants. The dark blue and yellow lines represent the linear fit between sample cumulants and those from the ground truth. The slope of each linear fit is shown in the legends. All cumulants were estimated using $10^7$ samples. (b) Pearson correlation coefficients between sample and ground truth cumulants. The light green dashed line indicates the ideal value of these coefficients, which is equal to 1. Error bars were obtained using 100 bootstrapping resamples. (c) XEB as a function of the total number of clicks $C$, for both experimental (dark blue dashed line with circles) and double-elision (yellow dashed line with circles) samples. For reference, we also include the XEB scores of the squashed states model martinez2023classical (purple dashed line with circles). Error bars were obtained through error propagation. The estimation of the XEB scores used $4000$ samples per number of clicks. The inset figure shows the total click distribution, $p(C)$, of the sampler. The shaded region indicates the range of total number of clicks, $C$, for which the XEB scores were computed.
  • Figure 3: Validation tests for the J2-P65-5 experiment. (a) Comparison between the estimated click cumulants (orders 2 to 5), from experimental (dark blue circles) and double-elision samples with $K=5$ (yellow circles), and those predicted by the ground truth of the experiment (light green line). The vertical axis corresponds to cumulants obtained from samples, either experimental or double-elision, while the horizontal axis corresponds to the ground truth (i.e., theoretical) cumulants. The dark blue and yellow lines represent the linear fit between sample cumulants and those from the ground truth. The slope of each linear fit is shown in the legends. All cumulants were estimated using $10^7$ samples. (b) Pearson correlation coefficients between sample and ground truth cumulants. The light green dashed line indicates the ideal value of these coefficients, which is equal to 1. Error bars were obtained using 100 bootstrapping resamples. (c) Total click distribution as a function of the total number of clicks $C$, for both experimental (dark blue circles) and double-elision (yellow circles) samples. The ground truth distribution corresponds to the light green dashed line with circles, which was obtained using phase space techniques drummond2022simulating. Error bars were obtained through bootstrapping.
  • Figure 4: Validation tests for the J3-high experiment. (a) Comparison between the estimated click cumulants (orders 2 to 5), from experimental (dark blue circles) and double-elision samples with $K=5$ (yellow circles), and those predicted by the ground truth of the experiment (light green line). The vertical axis corresponds to cumulants obtained from samples, either experimental or double-elision, while the horizontal axis corresponds to the ground truth (i.e., theoretical) cumulants. The dark blue and yellow lines represent the linear fit between sample cumulants and those from the ground truth. The slope of each linear fit is shown in the legends. All cumulants were estimated using $10^7$ samples. (b) Pearson correlation coefficients between sample and ground truth cumulants. The light green dashed line indicates the ideal value of these coefficients, which is equal to 1. Error bars were obtained using 100 bootstrapping resamples. (c) Total click distribution as a function of the total number of clicks $C$, for both experimental (dark blue circles) and double-elision (yellow circles) samples. The ground truth distribution corresponds to the light green dashed line with circles, and was obtained using phase space techniques drummond2022simulating. Error bars were obtained through bootstrapping. The estimation of the experimental and double-elision total click distributions was obtained using $10^7$ samples.
  • Figure 5: Double-elision algorithm cost estimation. (a) Sampling time (in hours) as a function of the number of modes, $M$, for $K=4$ (red circles) and $K=5$ (teal circles). Each point indicates the time it takes to generate $10^7$ samples. (b) Sampling rate (in $\mathrm{s}^{-1}$) as a function of the number of cores for $K=4$ (red circles) and $K=5$ (teal circles). (c) Time of the Phase I, i.e. data-structure pre-processing, (in $\mathrm{s}$) as a function of the number of modes, $M$, for $K=4$ (red circles) and $K=5$ (teal circles). (d) Time of the Phase II pre-sampling preparation (in $\mathrm{s}$) as a function of the number of modes, $M$, for $K=4$ (red circles) and $K=5$ (teal circles). In Figs. (a) to (d), the dotted lines correspond to the linear or power law fits shown in the legends. In figures (a), (c) and (d), both the horizontal and vertical axes are in logarithmic scale. All data where obtained using randomly generated interferometers, except for (b) where we used the 144-mode interferometer of the J3-high configuration.