Parallelization of Adaptive Quantum Channel Discrimination in the Non-Asymptotic Regime

Bjarne Bergh; Nilanjana Datta; Robert Salzmann; Mark M. Wilde

TL;DR

The paper shows that the optimization over all parallel strategies can be carried out in time polynomial in the number of channel uses, which yields a poly-time-computable, asymptotically tight upper bound on the performance of general adaptive strategies.

Abstract

We investigate the performance of parallel and adaptive quantum channel discrimination strategies for a finite number of channel uses. It has recently been shown that, in the asymmetric setting with asymptotically vanishing type I error probability, adaptive strategies are asymptotically not more powerful than parallel ones. We extend this result to the non-asymptotic regime with finitely many channel uses, by explicitly constructing a parallel strategy for any given adaptive strategy, and bounding the difference in their performances, measured in terms of the decay rate of the type II error probability per channel use. We further show that all parallel strategies can be optimized over in time polynomial in the number of channel uses, and hence our result can also be used to obtain a poly-time-computable asymptotically tight upper bound on the performance of general adaptive strategies.

Paper Structure (22 sections, 10 theorems, 176 equations, 3 figures)

Introduction
Preliminaries, Notation, and Previous Results
Notation
Quantum Information Measures
Fidelity and Sine Distance
(Smoothed) Max-Divergence
Rényi Divergences
Channel Divergences
Quantum State Discrimination
Quantum Channel Discrimination
Asymptotic Equivalence of Adaptive and Parallel Strategies
Parallelizing an n-Shot Adaptive Protocol
Computability
A Simple One-Shot Version of the Chain Rule
Non-Asymptotic Bounds for the Smoothed Max-Relative Entropy
...and 7 more sections

Key Result

Lemma 1

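Let $\rho, \sigma \in \mathcal{D}\left(\mathcal{H}\right)$ be quantum states. Then for all $\varepsilon \in [0, 1)$, $$D_H^\varepsilon(\rho\|\sigma) \leq \frac{1}{1-\varepsilon}\left(D(\rho\|\sigma) + h(\varepsilon)\right),$$ where $h(\varepsilon)$ is the binary entropy function.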
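
As a numerical sanity check, the following is a minimal sketch for the commuting (classical) case only, assuming the bound takes the standard form $D_H^\varepsilon(\rho\|\sigma) \leq \frac{1}{1-\varepsilon}\left(D(\rho\|\sigma) + h(\varepsilon)\right)$. The function names and the example distributions are hypothetical and chosen for illustration; they are not taken from the paper. For probability vectors the optimal test follows from the Neyman-Pearson lemma, so $D_H^\varepsilon$ can be evaluated greedily without a solver.

```python
import math


def binary_entropy(eps):
    """Binary entropy h(eps) in bits, with h(0) = h(1) = 0."""
    if eps <= 0.0 or eps >= 1.0:
        return 0.0
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)


def relative_entropy(p, q):
    """Classical relative entropy D(p||q) in bits (assumes q_i > 0 for all i)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hypothesis_testing_divergence(p, q, eps):
    """D_H^eps(p||q) = -log2 min{ sum_i q_i t_i : sum_i p_i t_i >= 1 - eps, 0 <= t_i <= 1 }.

    Assumes q_i > 0 for all i. By the Neyman-Pearson lemma the optimal test
    accepts outcomes in order of decreasing likelihood ratio p_i / q_i and
    takes only a fraction of the last outcome that is needed.
    """
    order = sorted(range(len(p)), key=lambda i: p[i] / q[i], reverse=True)
    need = 1.0 - eps   # p-mass the test still has to accept
    beta = 0.0         # accumulated type II error probability
    for i in order:
        if need <= 1e-15 or p[i] == 0:
            break
        t = min(1.0, need / p[i])
        beta += t * q[i]
        need -= t * p[i]
    return -math.log2(beta)


def lemma1_upper_bound(p, q, eps):
    """Right-hand side of the assumed bound: (D(p||q) + h(eps)) / (1 - eps)."""
    return (relative_entropy(p, q) + binary_entropy(eps)) / (1.0 - eps)


# Hypothetical example distributions, for illustration only.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
for eps in (0.01, 0.1, 0.5):
    lhs = hypothesis_testing_divergence(p, q, eps)
    rhs = lemma1_upper_bound(p, q, eps)
    print(f"eps={eps}: D_H^eps={lhs:.4f}, bound={rhs:.4f}")
```

The greedy evaluation covers only diagonal states in a common basis; for general quantum states $D_H^\varepsilon$ is a semidefinite program and would need an SDP solver.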

Figures (3)

Figure 1: Illustration of a general adaptive protocol with $n$ uses of the black-box channel. The top row makes use of the given black-box $\mathcal{E}|\mathcal{F}$, which is either $\mathcal{E}$ or $\mathcal{F}$, while the bottom row depicts the memory system $R$. At various stages in the protocol, the green states $\rho$ occur if the channel is $\mathcal{E}$ and the purple states $\sigma$ occur if the channel is $\mathcal{F}$.
Figure 2: Illustration of a key step in our proof, the construction of the parallel input state. We start by picking a single step $\ell \in \{1, \ldots, n\}$ out of the adaptive protocol, where the distinguishability increase $D(\mathcal{E}(\rho_\ell)\|\mathcal{F}(\sigma_\ell)) - D(\rho_\ell\|\sigma_\ell)$ is maximal (this corresponds to the step from the orange to the dotted grey line in the diagram). Now consider $m$ copies of the adaptive strategy in parallel. We construct our parallel input state $\nu$ starting from $m$ copies of the input state of the adaptive strategy at this step $\ell$ if the channel was $\mathcal{E}$ (this is $\rho_\ell^{\otimes m}$). The state $\nu$ is then smoothed a bit to reduce its distance to $\sigma_\ell^{\otimes m}$ (which is the input state that we would have if the channel was $\mathcal{F}$). The degree to which we smooth depends on the type I error $\alpha_p$ we want to achieve with the parallel strategy. Having a small type I error means that the state $\nu$ is very close to $\rho_\ell^{\otimes m}$, whereas allowing for a larger type I error will move the state closer to $\sigma_\ell^{\otimes m}$.
Figure 3: Illustration of the type II error decay rate per channel use of a simple adaptive and parallel strategy for a specific pair of channels (see \ref{eq:example-channel-1} and \ref{eq:example-channel-2} for the definitions of the channels $\mathcal{E}$ and $\mathcal{F}$, respectively, where $\kappa = 2^{-50}$). We compare a fixed adaptive strategy with two channel uses (constant black line) to (i) our lower bound on the performance of a parallel strategy (yellow line) and (ii) the actual performance of a parallel strategy (red and green lines), which are plotted as functions of the number of parallel channel uses $m$. The black line shows the value of \ref{eq:example_two_uses}, i.e., the type II error exponent for the given adaptive strategy with two channel uses and type I error $\alpha_a = 0$. This can alternatively be thought of as the rate of repeating the two-step adaptive strategy $m/2$ times in parallel. The yellow line shows the lower bound on the parallel strategy from our theorem (i.e., the right-hand side of \ref{eq:thm_seq_par}), choosing $\alpha_p = 2^{-5}$. For this specific example we can calculate the parallel input state $\nu$ of our theorem, and while we cannot explicitly find the optimal POVM and corresponding type II error (i.e., we cannot explicitly calculate the left-hand side of \ref{eq:thm_seq_par}), we can bound it from above and below using the second-order asymptotics of the hypothesis testing relative entropy, which is shown in the red and green lines, corresponding to the values of \ref{eq:example_second_order_upper} and \ref{eq:example_second_order_lower}. We see that for small $m$ there is a gap between the adaptive and parallel strategies; i.e., the adaptive strategy offers an advantage. This advantage disappears once $m$ gets larger, and in this specific example the chosen adaptive strategy is eventually even surpassed by the parallel strategy, as the adaptive strategy turns out not to be asymptotically optimal.

Theorems & Definitions (27)

Lemma 1: Upper bound on $D_H^\varepsilon$
Lemma 2: Relation to smoothed max-divergence [anshu_minimax_2019]
Remark 3
Lemma 4: Better for large/small $\varepsilon$, but only for $n$ large enough
proof
Corollary 5: Main result, simple version
Remark 6
Remark 7
Theorem 8: Main result, technical version
Remark 9
...and 17 more
