A supplemental investigation of non-linearity in quantum generative models with respect to simulatability and optimization
Kaitlin Gili, Rohan S. Kumar, Mykolas Sveistrys, C. J. Ballance
TL;DR
The paper asks two questions about the non-linearity introduced by Repeat-Until-Success (RUS) sub-routines in quantum generative models: whether the deferred measurement principle makes such models classically simulatable, and whether the non-linearity affects training stability. Using the Quantum Neuron Born Machine (QNBM), the authors compare the nonlinear QNBM to a linearized version that maps to a Bayesian network, and show that the nonlinear model is not trivially classically simulatable. They run training experiments on three distributions with a (5,0,6) architecture and observe that optimization can be unstable and dataset-dependent: some tasks show strong performance in the best trials but high variance across trials. The results suggest that non-linearity in quantum generative models is a double-edged tool, not trivially classically simulatable but potentially destabilizing, which motivates further work on inductive biases and task-aligned datasets.
Abstract
Recent work has demonstrated the utility of introducing non-linearity through repeat-until-success (RUS) sub-routines into quantum circuits for generative modeling. As a follow-up to this work, we investigate two questions of relevance to the quantum algorithms and machine learning communities: Does introducing this form of non-linearity make the learning model classically simulatable due to the deferred measurement principle? And does introducing this form of non-linearity make the overall model's training less stable? With respect to the first question, we demonstrate that the RUS sub-routines do not allow us to trivially map this quantum model to a classical one, whereas a model that contains mid-circuit measurements but no RUS sub-circuits could be mapped to a classical Bayesian network due to the deferred measurement principle of quantum mechanics. This strongly suggests that the proposed form of non-linearity makes the model classically inefficient to simulate. In pursuit of the second question, we train larger models than previously shown on three different probability distributions, one continuous and two discrete, and compare the training performance across multiple random trials. We see that while the model performs exceptionally well in some trials, the high variance across trials on certain datasets indicates relatively poor training stability.
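To make the deferred measurement argument concrete, the following is a minimal two-qubit sketch in Python with NumPy. It is not the QNBM or the authors' construction; the circuit, the angles, and the function names are assumptions chosen purely for illustration. It checks that measuring a control qubit mid-circuit and then applying a classically controlled rotation gives the same joint outcome distribution as applying the quantum-controlled rotation and measuring both qubits at the end. The first form factorizes as p(m0) p(m1 | m0), which is a two-node Bayesian network.

```python
import numpy as np


def ry(angle):
    """Single-qubit rotation about the Y axis (real-valued 2x2 matrix)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])


def measured_then_controlled(theta, phi):
    """Hypothetical toy circuit with a mid-circuit measurement.

    Qubit 0 is rotated by theta and measured immediately. Qubit 1 is rotated
    by phi only if the outcome was 1. Returns p[m0, m1] = p(m0) * p(m1 | m0).
    """
    control = ry(theta) @ np.array([1.0, 0.0])
    p = np.zeros((2, 2))
    for m0 in (0, 1):
        p_m0 = abs(control[m0]) ** 2          # marginal of the parent node
        target = np.array([1.0, 0.0])
        if m0 == 1:                           # classically controlled gate
            target = ry(phi) @ target
        p[m0] = p_m0 * np.abs(target) ** 2    # conditional of the child node
    return p


def deferred_measurement(theta, phi):
    """Same toy circuit with the measurement deferred to the end.

    The classical control is replaced by a quantum controlled-Ry gate and
    both qubits are measured after all gates have been applied.
    """
    zero = np.array([1.0, 0.0])
    state = np.kron(ry(theta) @ zero, zero)   # qubit 0 is the leading index
    proj0, proj1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
    controlled_ry = np.kron(proj0, np.eye(2)) + np.kron(proj1, ry(phi))
    state = controlled_ry @ state
    return (np.abs(state) ** 2).reshape(2, 2)  # rows: m0, columns: m1


if __name__ == "__main__":
    theta, phi = 0.7, 1.9                      # arbitrary illustrative angles
    p_mid = measured_then_controlled(theta, phi)
    p_end = deferred_measurement(theta, phi)
    print(np.allclose(p_mid, p_end))
```

For any choice of the two angles, the two distributions should agree up to floating-point error. The abstract's claim is that RUS sub-routines do not admit such a trivial mapping to a classical model.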
