Learning Quantum-Samplers for Stochastic Processes with Quantum Sequence Models

Ximing Wang; Chengran Yang; Chidambaram Aditya Somasundaram; Jayne Thompson; Mile Gu

Abstract

Quantum circuits that generate coherent superpositions of stochastic processes are key to many downstream quantum-accelerated tasks, such as risk analysis, importance sampling, and DNA sequencing. However, traditional methods for designing such circuits from data face immense challenges, given the exponential growth in the size of the associated probability vectors as the desired simulation time horizon increases. Here, we introduce quantum sequence models that leverage a recurrent quantum circuit structure to generate coherent superpositions with circuit complexity that grows linearly with the desired time horizon; together with a recurrent variant of the parameter-shift rule, we train these models from observational data. When benchmarked against baseline quantum Born machines, our constructions exhibit orders-of-magnitude improvements in model accuracy in data-sparse regimes.

Paper Structure (8 sections, 3 theorems, 46 equations, 9 figures, 3 algorithms)

Data collection from long sequences
Parameter Shift Rule for Recurrent Quantum Circuits
Background of the Parameter Shift Rule
Generalization of the Parameter Shift Rule
Evaluating Cost Functions
Error Analysis
Ansatzes for Benchmarking
Simplified Universal Quantum Circuits for Stochastic Process

Key Result

Theorem 1

Let $f$ be a linear expectation functional with associated cost functions $\lambda(x_{-M:L})$. Then the output $G$ of Algorithm alg:ParameterShift is an unbiased estimator of the gradient $\pdv{\theta}f(\mathcal{Q}_\theta)$.
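
As a rough numerical illustration of the theorem, the sketch below checks unbiasedness on a toy recurrent circuit. Everything specific in it is an assumption made for illustration and not the paper's construction: the one-qubit memory, the choice $V(\theta) = \mathrm{CNOT}\cdot(R_Y(\theta)\otimes I)$, the cost $\lambda(x)$ (the fraction of $1$s in the string), and all function names. `T` plays the role of $M+L$. The estimator follows the recipe described in the Figure 5 caption: pick a random step $i$ and sign $s$, shift $\theta$ in the $i$-th unitary by $s\cdot\frac{\pi}{2}$, measure, and average $(M+L)\, s\, \lambda(x)$.

```python
import itertools
import numpy as np

def ry(theta):
    """Single-qubit rotation exp(-i * theta * Y / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# CNOT with the memory qubit as control and the output qubit as target.
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)

def string_probs(thetas):
    """Exact output distribution of the toy recurrent circuit, one theta per step."""
    strings = list(itertools.product([0, 1], repeat=len(thetas)))
    probs = []
    for x in strings:
        psi = np.array([1.0, 0.0])  # memory starts in |0>
        for bit, theta in zip(x, thetas):
            v = CNOT @ np.kron(ry(theta), np.eye(2))  # toy V(theta), an assumption
            joint = (v @ np.kron(psi, [1.0, 0.0])).reshape(2, 2)  # [memory, output]
            psi = joint[:, bit]  # unnormalized memory after emitting `bit`
        probs.append(float(psi @ psi))  # amplitudes are real in this toy model
    probs = np.array(probs)
    return strings, probs / probs.sum()

def f(theta, T, lam):
    """Linear expectation functional: sum_x lam(x) * Q_theta(x)."""
    strings, probs = string_probs([theta] * T)
    return float(sum(p * lam(x) for x, p in zip(strings, probs)))

def shifted(theta, T, i, s):
    """Distribution when only the i-th copy of V has theta shifted by s * pi / 2."""
    thetas = [theta] * T
    thetas[i] += s * np.pi / 2
    return string_probs(thetas)

def estimator_mean(theta, T, lam):
    """Exact expectation of T * s * lam(x) over uniform i, uniform s, measured x."""
    total = 0.0
    for i in range(T):
        for s in (-1, 1):
            strings, probs = shifted(theta, T, i, s)
            total += sum(p * T * s * lam(x) for x, p in zip(strings, probs)) / (2 * T)
    return float(total)

def sampled_gradient(theta, T, lam, n_samples, rng):
    """Monte Carlo version: one random (i, s) and one measured string per sample."""
    cache = {(i, s): shifted(theta, T, i, s) for i in range(T) for s in (-1, 1)}
    total = 0.0
    for _ in range(n_samples):
        i, s = int(rng.integers(T)), int(rng.choice([-1, 1]))
        strings, probs = cache[(i, s)]
        x = strings[rng.choice(len(strings), p=probs)]
        total += T * s * lam(x)
    return total / n_samples

lam = lambda x: sum(x) / len(x)  # hypothetical cost: fraction of 1s in the string
theta, T, h = 0.7, 4, 1e-5
finite_diff = (f(theta + h, T, lam) - f(theta - h, T, lam)) / (2 * h)
exact_mean = estimator_mean(theta, T, lam)
sampled = sampled_gradient(theta, T, lam, 20000, np.random.default_rng(0))
print(finite_diff, exact_mean, sampled)
```

In this toy setting the exact mean of the estimator should agree with the finite-difference gradient up to numerical error, while the sampled value fluctuates around it. The factor `T` compensates for shifting only one of the `T` copies of $V(\theta)$ per sample.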

Figures (9)

Figure 1: Quantum Sequence Models. A quantum sequence model $\mathcal{Q}_{\theta}$ consists of a memory $S$ and a sequence of output registers $\qty{\chi_t}$. Its dynamics are governed by a unitary $V(\vb*{\theta})$, which interacts $S$ with $\chi_t$ at each time-step $t$. Here, the $\vb*{\theta}$-dependence represents some parameterization of such sequence models. We say that $\mathcal{Q}_{\theta}$ is a quantum-sampler for a stochastic process $Q_{\theta}(\overleftarrow{X},\overrightarrow{X})$ if its resulting measurement statistics, when each $\chi_t$ is measured, are governed by $Q$. Our goal is to learn $\vb*{\theta}$ such that $Q$ closely approximates a given target stochastic process $P$.
Figure 2: Uniform renewal process. The dynamics of a uniform renewal process $P_N$ of order $N$ can be described by a hidden Markov model (HMM) with $N$ internal states $S_0, S_1, \ldots, S_{N-1}$, represented by blue nodes on the directed graph above. The edges between nodes describe transitions between internal states: an edge from $S_j$ to $S_k$ labeled $x|p$ means that a model in state $S_j$ at time-step $t$ will output $x$ and transition to state $S_k$ with probability $p$. For the renewal process above, the HMM describes a binary sequence of mostly $0$s (no-tick) punctuated by $1$s (tick), where the number of $0$s between $1$s is uniformly distributed between $0$ and $N-1$. To enable such statistics, the HMM must have $N$ states, as the probability of the next tick depends on how many time-steps have occurred since the previous tick.
Figure 3: Performance Benchmarks. (a) We compare three quantum models in terms of KL-divergence rates: a recurrent quantum circuit with 1-qubit memory (blue), a recurrent quantum circuit with 2-qubit memory (green), and a non-recurrent circuit (orange), each tasked with learning uniform renewal processes of order $N$ from $3$ to $8$ with training data size $t = 50{,}000$. The true distortions are shown as solid bars, while the empirical distortions are shown as crossed bars. Both the 1-qubit (blue) and 2-qubit (green) recurrent models clearly outperform the non-recurrent baseline, even when the recurrent models do not fit the training data as well. (b) The performance advantage is magnified in data-sparse settings where the training data size (x-axis) is lowered to $5000$ and $500$. While the non-recurrent baseline is able to fit training data almost perfectly (dashed line on right), its true distortion (solid line on left) becomes an order of magnitude higher than that of its recurrent counterparts.
Figure 4: One possible decomposition of a variational circuit $\mathcal{Q}(\theta_k)$ into $2n+1$ parts.
Figure 5: Given the weights $\lambda(x_{-M:L})$ that define a function $f$, the gradient can be evaluated using the procedure in the algorithm. At each iteration, a random step $i$ is selected, and the parameter $\theta$ in the $i$-th unitary $V(\theta)$ is shifted by $s\cdot \frac{\pi}{2}$, where $s$ is randomly chosen from $\qty{-1,+1}$. The quantum sample $\ket{\Phi}_{\overleftarrow{\chi}_M, \overrightarrow{\chi}_L,S}^{s,i}(\theta)$ is then measured in the computational basis to obtain the output string $x_{-M:L}$. The mean value $G = \mathbb{E}[(M+L) s \lambda(x_{-M:L})]$ is then an unbiased estimator of the gradient $\pdv{\theta} f(\ket{\Phi}_{\overleftarrow{\chi}_M, \overrightarrow{\chi}_L,S}(\theta))$.
...and 4 more figures
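
The uniform renewal process described in the Figure 2 caption can be sampled classically with a few lines of code, which may help make the HMM concrete. This is a minimal sketch under stated assumptions: the state $S_k$ is taken to mean "$k$ zeros emitted since the last tick", the chain starts in $S_0$, and the tick probability in $S_k$ is set to $1/(N-k)$, which is the hazard rate that makes the gap length uniform on $\{0, \ldots, N-1\}$. The function names are hypothetical, and the actual edge labels in the figure are not reproduced here.

```python
import numpy as np

def sample_uniform_renewal(order, length, rng):
    """Sample a binary string from an order-N uniform renewal process."""
    state = 0  # assumed meaning of S_k: k zeros emitted since the last tick
    bits = []
    for _ in range(length):
        # Hazard rate of a gap that is uniform on {0, ..., order - 1}.
        p_tick = 1.0 / (order - state)
        if rng.random() < p_tick:
            bits.append(1)  # tick: return to S_0
            state = 0
        else:
            bits.append(0)  # no tick: advance to S_{k+1}
            state += 1
    return bits

def gap_lengths(bits):
    """Number of 0s between each pair of consecutive 1s."""
    ticks = [t for t, b in enumerate(bits) if b == 1]
    return [b - a - 1 for a, b in zip(ticks, ticks[1:])]

order = 4
seq = sample_uniform_renewal(order, 60000, np.random.default_rng(1))
gaps = gap_lengths(seq)
freqs = np.bincount(gaps, minlength=order) / len(gaps)
print(freqs)  # empirical gap distribution, expected to be close to uniform
```

In state $S_{N-1}$ the tick probability equals $1$, so a gap never exceeds $N-1$ zeros. This is also why the model needs to track how many time-steps have passed since the previous tick.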

Theorems & Definitions (5)

Theorem 1: Recurrent Parameter Shift
Proposition 2
proof
Proposition 3
proof
