Enhancing variational quantum state diagonalization using reinforcement learning techniques

Akash Kundu, Przemysław Bedełek, Mateusz Ostaszewski, Onur Danaci, Yash J. Patel, Vedran Dunjko, Jarosław A. Miszczak

TL;DR

This work addresses diagonalizing quantum states on NISQ hardware by augmenting Variational Quantum State Diagonalization (VQSD) with Reinforcement Learning (RL) to automate compact ansatz construction. It introduces a binary, depth-based RL-state encoding and a dense reward within a Double Deep Q-Network framework, achieving significantly shallower and fewer-gate circuits while preserving or improving eigenvalue accuracy. Across 2-, 3-, and 4-qubit test cases, RL-VQSD outperforms fixed-structure Layered Hardware Efficient Ansatz (LHEA) and exhibits strong scaling advantages, aided by an encoding and reward design that tightly couples the RL agent to the VQSD objective. The approach is readily adaptable to other variational quantum algorithms and demonstrates the potential of RL-based architecture search for quantum data processing on near-term devices.
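The Double Deep Q-Network mentioned above differs from a plain DQN in how the learning target is formed: the online network selects the next action, and the target network evaluates it. The sketch below shows that target computation on plain Python lists. The function name, argument names, and numeric values are illustrative assumptions and are not taken from the paper.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma, done):
    """Return the Double DQN regression target for one transition.

    next_q_online / next_q_target: per-action Q-value estimates for the next
    RL-state from the online and target networks (plain lists here).
    """
    if done:
        # Terminal step: no bootstrapping from the next state.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


# Hypothetical numbers: the online network prefers action 1,
# so the target network's estimate for action 1 is used.
example_target = double_dqn_target(1.0, [0.2, 0.9], [0.5, 0.3], gamma=0.5, done=False)
```

Decoupling selection from evaluation is the standard way Double DQN reduces the overestimation bias of the max operator in ordinary Q-learning.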

Abstract

Variational quantum algorithms are crucial for the application of NISQ computers. Such algorithms require short quantum circuits, which are more amenable to implementation on near-term hardware, and many such methods have been developed. One of particular interest is the so-called variational quantum state diagonalization method, which constitutes an important algorithmic subroutine and can be used directly to work with data encoded in quantum states. In particular, it can be applied to discern the features of quantum states, such as the entanglement properties of a system, or in quantum machine learning algorithms. In this work, we tackle the problem of designing a very shallow quantum circuit, required in the quantum state diagonalization task, by utilizing reinforcement learning (RL). We use a novel encoding method for the RL-state, a dense reward function, and an $\epsilon$-greedy policy to achieve this. We demonstrate that the circuits proposed by the reinforcement learning methods are shallower than those used by the standard variational quantum state diagonalization algorithm and thus can be used in situations where hardware capabilities limit the depth of quantum circuits. The methods we propose in the paper can be readily adapted to address a wide range of variational quantum algorithms.
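As a minimal sketch of the $\epsilon$-greedy policy named in the abstract: with probability $\epsilon$ the agent explores by picking a random action, and otherwise it exploits the action with the highest estimated value. The function name, the list-based Q-values, and the seeded generator are assumptions made for illustration; the abstract does not specify the paper's action set or exploration schedule.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Choose an action index from a list of estimated action values."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)  # seeded only to make the illustration repeatable
greedy_choice = epsilon_greedy([0.1, 0.5, 0.9], epsilon=0.0, rng=rng)
random_choice = epsilon_greedy([0.1, 0.5, 0.9], epsilon=1.0, rng=rng)
```

With `epsilon=0.0` the call always returns the greedy action; with `epsilon=1.0` it always samples uniformly.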

Paper Structure

This paper contains 27 sections, 13 equations, 12 figures, 7 tables.

Figures (12)

  • Figure 1: Elements of the Variational Quantum State Diagonalization (VQSD) algorithm. The presented example considers the diagonalization of a $2$-qubit input state. Note that diagonalizing an $N$-qubit quantum state requires $2N$ qubits.
  • Figure 2: Structure of a layered hardware efficient ansatz, where the ansatz $U(\vec{\theta})$ is decomposed into layer-wise unitaries $U_i(\vec{\theta}_i)$ for $i = 1,2,\ldots, l$. Each layer unitary $U_i(\vec{\theta}_i)$ is further decomposed into two-qubit rotations. For $\vec{\theta}_i^j$, index $i$ denotes the layer number, and $j$ indexes the parameters within that layer.
  • Figure 3: Two possible decompositions of the two-qubit rotations in each layer-wise unitary $U_i(\vec{\theta}_i)$. Each rotation can be constructed in one of two forms, with one or three parameters, respectively.
  • Figure 4: Illustration of the RL-VQSD process. The VQA acts as the environment and the ansatz as the RL-state. The RL-agent receives from the environment the RL-state and a reward computed from the optimized cost function. Following an $\epsilon$-greedy policy, the agent then chooses an action (i.e., a quantum gate), which updates the RL-state in the next step. Using the new RL-state, the VQA optimizes the cost function and generates a new reward, which is fed back to the agent. This process repeats until all the steps in an episode are exhausted or the cost function reaches a predefined threshold value. Throughout the paper, we start the RL-VQSD with an empty circuit, i.e., $U(\vec{\alpha})=\mathbb{I}$, and at each step the agent chooses an action to extend the RL-ansatz.
  • Figure 6: Summary of results for diagonalizing a full-rank random $2$-qubit density matrix. One panel illustrates eigenvalue convergence for the diagonalization of a single mixed quantum state; the other compares the performance of the RL-agent-generated ansatz with the LHEA. The RL-agent-generated ansatz approximates the eigenvalues more accurately. Additionally, the RL-based methods reach the accuracy of the LHEA with circuits of significantly reduced depth.
  • ...and 7 more figures
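
The episode loop described in the Figure 4 caption can be sketched as follows. This is only a structural outline under stated assumptions: `run_episode`, `GATES`, and `toy_cost` are hypothetical names, the toy cost stands in for the variational optimization of the real VQSD cost, and the reward (negative cost) is a placeholder rather than the paper's dense reward function.

```python
import random


def run_episode(choose_action, evaluate_cost, max_steps, threshold):
    """Grow a circuit gate by gate until the cost is low enough or steps run out."""
    circuit = []   # empty circuit, i.e. the ansatz starts as the identity
    history = []
    for _ in range(max_steps):
        action = choose_action(circuit)   # agent picks a gate given the RL-state
        circuit.append(action)            # the action updates the RL-state
        cost = evaluate_cost(circuit)     # environment scores the new ansatz
        reward = -cost                    # placeholder reward (assumption)
        history.append((action, cost, reward))
        if cost <= threshold:             # predefined threshold reached
            break
    return circuit, history


GATES = ["rx", "ry", "rz", "cnot"]        # hypothetical action set


def toy_cost(circuit):
    # Stand-in for the optimized cost; it simply shrinks as gates are added.
    return 1.0 / (1 + len(circuit))


rng = random.Random(0)
circuit, history = run_episode(
    lambda state: rng.choice(GATES), toy_cost, max_steps=10, threshold=0.25
)
```

In a real setting, `evaluate_cost` would run the classical optimizer over the ansatz parameters, and `choose_action` would query the trained Q-network through an $\epsilon$-greedy policy.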