Out-of-distribution generalisation for learning quantum channels with low-energy coherent states

Jason L. Pereira, Quntao Zhuang, Leonardo Banchi

TL;DR

This work establishes a general theoretical framework for out-of-distribution generalisation when learning continuous-variable (CV) quantum channels from low-energy coherent-state probes. It proves that a small in-distribution error implies a bound on the output distance between two channels for any input state, via a concave bound $\epsilon(\epsilon_0, r^2)$ and a three-stage construction that extends in-distribution performance first to high-energy coherent inputs and then to non-classical inputs. The authors derive explicit bounds for representative channel classes (Gaussian, cubic phase, Kerr) and for various non-classical input states (SPATS, Fock states, squeezed vacuums), and they discuss the tightness of the bounds and give practical guidance for applying them. They also connect the results to quantum process tomography, metrology, and machine learning paradigms, and discuss extensions to multi-mode settings and potential implications for quantum discrimination tasks. Overall, the findings quantify how experimentally accessible, low-energy coherent probes suffice to predict and bound channel behaviour across the full CV landscape, given adequate sampling.

Abstract

When experimentally learning the action of a continuous variable quantum process by probing it with inputs, there will often be some restriction on the input states used. One experimentally simple way to probe a quantum channel is using low energy coherent states. Learning a quantum channel in this way presents difficulties, due to the fact that two channels may act similarly on low energy inputs but very differently for high energy inputs. They may also act similarly on coherent state inputs but differently on non-classical inputs. Extrapolating the behaviour of a channel for more general input states from its action on the far more limited set of low energy coherent states is a case of out-of-distribution generalisation. To be sure that such generalisation gives meaningful results, one needs to relate error bounds for the training set to bounds that are valid for all inputs. We show that for any pair of channels that act sufficiently similarly on low energy coherent state inputs, one can bound how different the input-output relations are for any (high energy or highly non-classical) input. This proves out-of-distribution generalisation is always possible for learning quantum channels using low energy coherent states, as long as enough samples are used.

Paper Structure

This paper contains 32 sections, 6 theorems, 136 equations, 7 figures, 1 table.

Key Result

Lemma 1

Suppose $\|(\Psi-\Phi)[| r e^{i\phi} \rangle \langle r e^{i\phi} |_{\mathrm{coh}}]\| = 0$ for all $\phi$ and all $r\leq \tau$, for some $\tau>0$. Then, for all $\phi$ and all $r$, $\|(\Psi-\Phi)[| r e^{i\phi} \rangle \langle r e^{i\phi} |_{\mathrm{coh}}]\| = 0$.
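
As a toy illustration of the statement above (not the paper's proof), take both channels to be phase rotations whose angles differ by `delta_theta`. This is a minimal sketch: the function name and parameters are hypothetical, and the norm is assumed to be the trace norm. Both outputs are then coherent states of amplitude $r$, so the standard pure-state formula $\|\,|a\rangle\langle a| - |b\rangle\langle b|\,\| = 2\sqrt{1-|\langle a|b\rangle|^2}$ with $|\langle\alpha|\beta\rangle|^2 = e^{-|\alpha-\beta|^2}$ applies.

```python
import math

def coherent_trace_norm(r, delta_theta):
    """Trace norm between a coherent state of amplitude r sent through two
    phase rotations whose angles differ by delta_theta (toy model)."""
    # Squared separation of the two output amplitudes:
    # |r e^{i a} - r e^{i b}|^2 = 4 r^2 sin^2((a - b) / 2)
    sep_sq = 4.0 * r**2 * math.sin(delta_theta / 2.0) ** 2
    # Pure states: 2 * sqrt(1 - overlap), with overlap = exp(-sep_sq).
    # -expm1(-x) evaluates 1 - exp(-x) accurately for small x.
    return 2.0 * math.sqrt(-math.expm1(-sep_sq))

if __name__ == "__main__":
    for r in (0.5, 1.0, 5.0, 10.0):
        print(r, coherent_trace_norm(r, 0.1))
```

In this toy model the distance at any single $r>0$ vanishes only when `delta_theta` is a multiple of $2\pi$, in which case it vanishes for every $r$, consistent with the lemma. For a fixed non-zero `delta_theta` with $|\delta\theta| \leq \pi$, the distance grows with $r$ and saturates at 2, which is why a small but non-zero low-energy error needs an explicit energy-dependent bound.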

Figures (7)

  • Figure 1: Parts (a) and (b) show two different ways we could learn a channel, $\Phi$, that mimics the action of a target process, $\Psi$, for low energy coherent states. (a) is "classical" learning of a quantum channel, whilst (b) is a form of quantum machine learning. In this illustration, we depict probing the optical properties of an unknown (green) sample, although the process could be a magnetic field, a non-linear medium, a reflective cavity, or any other transformation that could be applied to a state of light. We search through a set of possible (yellow) optical mediums to find the one that best imitates the green substance. In (a), we use a (finite energy) laser to send low energy coherent states through the target and characterise the outputs with measurements. The results are post-processed (e.g., by feeding them into a classical computer, per the diagram) to obtain classical knowledge of the input-output relations. We can then "simulate" the action of $\Psi$ on an unknown state, using our classical computer and the transformation $\Phi$. In (b), we have a tuneable quantum device (such as a different substance with tuneable optical properties, a tuneable magnetic field, or some parametrised optical circuit) that we want to use to imitate the target process. We probe both processes with the same coherent states (we show two lasers, but a real implementation could use a single laser and a balanced beamsplitter, to ensure the probing states are identical). Instead of characterising both outputs, we measure them jointly, to determine their trace distance. We could then use the measurement results to tune the quantum device and so reduce the distance. Once $\Phi$ is tuned, we hope that an unknown state sent through the quantum device will be transformed similarly to if it were sent through $\Psi$. Parts (c-e) illustrate the three types of input for which we can bound the closeness of two channels. (c) corresponds to the input states for parts (a) and (b); the probes used in our physical measurements are low energy coherent states. The dotted line shows the maximum average energy of the inputs. We can construct a bound on the output distance between our target and learned processes, $\Psi$ and $\Phi$, based on our measurement results. In (d), we consider higher energy coherent state inputs. These are outside of the class of inputs for which we have actual measurement results; rather, we want to be able to trust that our simulation, $\Phi$, of $\Psi$ is still accurate for these inputs (in the scenario of part (a)) or that our tuned quantum device still mimics our target well (in the scenario of part (b)). We use our bound for the inputs shown in (c) to construct a bound on the inputs shown in (d). The scaling of this bound depends on the class of channels to which the target and learned channel belong. In (e), we consider other types of input states, including non-classical states such as squeezed vacuums and Fock states. We again want to trust that $\Phi$ is a good simulation of $\Psi$ for these types of input, and so we extend our bounds for the inputs in (d) to the inputs in (e).
  • Figure 2: Trace norm bounds as a function of the energy of the input coherent state using a variety of techniques. The red lines show the trivial, piecewise step function bound, the green lines show the bound for any pair of Gaussian channels (from Eq. (\ref{eq: gaussian epsilon})), and the blue lines show the bound when both channels are phase rotations (from Eq. (\ref{eq: PR epsilon})). In the plot on the left, we set $\epsilon_0=0.3$, whilst on the plot on the right, we set $\epsilon_0=0.1$. In both cases, $\tau^2=1$.
  • Figure 3: Output trace norm for a coherent state input for a pair of cubic phase unitaries (left) and for a pair of Kerr unitaries (right). The curves are plotted for various values of the parameter difference between the target and learned channels. In the case of cubic phase unitaries, the output trace norm appears to be an increasing function of the parameter difference for all values of $\bar{n}$, whilst for Kerr unitaries, we observe that many of the curves have inflection points and cross each other, so there are points at which a lower value of the parameter difference gives a higher output trace norm.
  • Figure 4: The minimum value of $\epsilon(\epsilon_0,\bar{n})$ for which the bound from Theorem \ref{th: general states} is non-trivial, when we fix $\frac{s(1-s)(M+1)}{1-2s}=\bar{n}$ (given on a logarithmic scale). For a given value of $\bar{n}$, we can choose any integer value for $M$ (indicated by the horizontal lines on the left plot). Then, the plot on the left can be interpreted as follows: if $\epsilon(\epsilon_0,\bar{n})$ (the bound on the output trace norm for classical inputs) is less than the value indicated by the colour, then the bound will be non-trivial. The white region (in the bottom right) does not admit a non-trivial bound (for this choice of relationship between $s$ and $M$), whilst the navy region (in the top left) has an extremely small value for the required $\epsilon(\epsilon_0,\bar{n})$. The plot on the right shows the same thing from a slightly different perspective: each curve is for a fixed value of $M$. As an example, by following the red, dashed line, we can see that for any input with average photon number $\bar{n} \leq 0.297$, we always have a non-trivial bound if $\epsilon(\epsilon_0,\bar{n})\leq 10^{-9}$.
  • Figure 5: The plot on the left shows the output trace norm, $T$, up to a factor of $\gamma$, for a pair of parity channels. The green line shows the exact value for SPATSs, whilst the orange and blue lines show the bounds coming from Eqs. (\ref{eq: finite negativity bound pm}) and (\ref{eq: finite bound with mu}) respectively. The red line shows the output trace norm for a coherent state of the same energy. Note that the x-axis tracks $q$, the average photon number of the thermal state before the photon addition, which is connected to the average photon number of the SPATS by $\bar{n}=1+2q$, so that $q=0$ corresponds to $\bar{n}=1$. The graph on the right plots the negativity of SPATSs against the ratio between the exact output trace norm and the upper bound from Eq. (\ref{eq: finite negativity bound pm}).
  • ...and 2 more figures
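
The relation $\bar{n}=1+2q$ quoted in the Figure 5 caption can be checked numerically. The sketch below assumes that a SPATS is a thermal state of mean photon number $q$ with one photon added, i.e. proportional to $a^\dagger \rho_{\mathrm{th}} a$; the function name and the Fock-space cutoff are hypothetical choices made for illustration.

```python
def spats_mean_photon_number(q, cutoff=2000):
    """Mean photon number of a single-photon-added thermal state, computed by
    direct summation over Fock levels (assumes the state is a^dag rho_th a,
    normalised, with thermal mean photon number q)."""
    x = q / (1.0 + q)       # ratio between successive thermal populations
    p = 1.0 / (1.0 + q)     # thermal population of the vacuum, p_th(0)
    norm = 0.0
    mean = 0.0
    for n in range(cutoff):
        # Adding a photon maps |n> to sqrt(n+1)|n+1>, so Fock level n+1
        # receives the unnormalised weight (n+1) * p_th(n).
        w = (n + 1) * p
        norm += w
        mean += (n + 1) * w
        p *= x              # advance to p_th(n+1)
    return mean / norm

if __name__ == "__main__":
    for q in (0.0, 0.5, 2.0):
        print(q, spats_mean_photon_number(q), 1.0 + 2.0 * q)
```

The same result follows analytically from the thermal moments $\langle n \rangle = q$ and $\langle n^2 \rangle = 2q^2+q$, which give $\langle (n+1)^2 \rangle / \langle n+1 \rangle = (2q+1)(q+1)/(q+1) = 1+2q$.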

Theorems & Definitions (6)

  • Lemma 1
  • Theorem 2
  • Theorem 3
  • Corollary 3.1
  • Corollary 3.2
  • Corollary 3.3