Reinforcement learning of quantum circuit architectures for molecular potential energy curves

Maureen Krumtünger, Alissa Wilms, Paul K. Faehrmann, Jens Eisert, Jakob Kottmann, Paolo Andrea Erdman, Sumeet Khatri

TL;DR

This work tackles the scalability challenge of designing quantum circuit ansätze for molecular potential energy curves (PECs) by introducing a transferable reinforcement-learning framework that learns an $R$-dependent circuit $\hat{U}(R,\boldsymbol{\theta}(R))$ to prepare the ground state across bond lengths. Using a discrete-continuous Soft Actor-Critic (SAC) strategy, the agent learns both gate placements and continuous angles conditioned on the bond distance $R$, enabling direct access to $E(R)$ and $|\psi(R)\rangle$ across the PEC. The approach yields chemically accurate energies for four- and six-qubit LiH and eight-qubit H$_4$, with shallow, interpretable circuits and substantial transferability across bond distances, offering a pathway toward generalizable quantum circuit design for larger molecular systems. The framework remains challenging to scale because of the large action space and simulation-based training, but it demonstrates transferability and interpretability, and it could be combined with existing greedy or fragment-based methods in future quantum chemistry workflows.

Abstract

Quantum chemistry and optimization are two of the most prominent applications of quantum computers. Variational quantum algorithms have been proposed for solving problems in these domains. However, the design of the quantum circuit ansatz remains a challenge. Of particular interest is developing a method to generate circuits for any given instance of a problem, not merely a circuit tailored to a specific instance of the problem. To this end, we present a reinforcement learning (RL) approach to learning a problem-dependent quantum circuit mapping, which outputs a circuit for the ground state of a Hamiltonian from a given family of parameterized Hamiltonians. For quantum chemistry, our RL framework takes as input a molecule and a discrete set of bond distances, and it outputs a bond-distance-dependent quantum circuit for arbitrary bond distances along the potential energy curve. The inherently non-greedy approach of our RL method contrasts with existing greedy approaches to adaptive, problem-tailored circuit constructions. We demonstrate its effectiveness for the four-qubit and six-qubit lithium hydride molecules, as well as an eight-qubit H$_4$ chain. Our learned circuits are interpretable in a physically meaningful manner, thus paving the way for applying RL to the development of novel quantum circuits for the ground states of large-scale molecular systems.

Paper Structure

This paper contains 53 sections, 40 equations, 21 figures, 3 tables.

Figures (21)

  • Figure 1: Problem: We consider the ground state energy of a molecular system, with Hamiltonian $\hat{H}(R)$, as a function of the bond distance $R\in[R_{\min},R_{\max}]$. Training: The RL agent is exposed to a discrete set $\{R_1,\dots,R_M\}$ of bond distances during training. During each episode, the agent sequentially constructs a quantum circuit from scratch by selecting gates, from a predefined gate set, and their corresponding parameters $a_t=(d_t,c_t)$. After each gate selection, the transition $(s_t(R),a_t,r_{t+1},s_{t+1}(R))$ is collected to populate the replay buffer. At regular intervals, batches are sampled from the replay buffer to update the agent’s policy network parameters $\phi$ via the Soft Actor-Critic (SAC) algorithm. Prediction: After training, the policy can predict individually adapted circuits for arbitrary bond distances $R$ within the training interval, enabling direct access to the PEC and corresponding wavefunctions.
  • Figure 2: Progression of the obtained energies (top) and errors (bottom) during training of one agent for four-qubit LiH at a fixed bond distance of 2.2 Å. Each point represents the final energy and error of one episode. Blue points indicate non-evaluation energies, while orange points represent evaluation energies. The dashed green line denotes the HF energy (top) and the HF error (bottom). The red line in the top plot represents the FCI energy. The gray area marks the region within chemical accuracy. The final learned energy and error are marked by a red point and are $-7.8441$ Ha and $0.0007$ Ha, respectively. The corresponding hyper-parameters are listed in Appendix \ref{app:hyper-parameters}.
  • Figure 3: Learned circuit corresponding to the final learned energy of the four-qubit LiH molecule shown in Fig. \ref{fig:LiH4_energy}. The circuit has a depth of six and consists of 12 gates, selected by the agent, with the first two grey-colored gates predefined to initialize the system in the HF state.
  • Figure 4: Energy error relative to the FCI energy per added gate of the learned circuit shown in Fig. \ref{LiH4_circuit} for the four-qubit LiH molecule.
  • Figure 5: Potential energy curve for four-qubit LiH (top) and absolute energy differences with respect to the FCI energy (bottom). The orange energies and errors correspond to bond distances that were seen during training, while the blue energies and errors indicate predictions for unseen bond distances. The red curve represents the exact potential energy curve for the four-qubit LiH system, and the gray area indicates the region of chemical accuracy. The green curve represents the HF approximation of the potential energy curve. The hyper-parameters are listed in Appendix \ref{app:hyper-parameters}.
  • ...and 16 more figures
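
The caption of Figure 1 describes transitions $(s_t(R),a_t,r_{t+1},s_{t+1}(R))$ with hybrid actions $a_t=(d_t,c_t)$ being collected in a replay buffer and sampled in batches for SAC updates. The following is a minimal Python sketch of such a buffer, not the authors' implementation. The names `Transition` and `ReplayBuffer`, the tuple-of-floats state encoding, the choice of a single float for $c_t$, and the uniform sampling are all assumptions made for illustration; the SAC update itself is not shown.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass(frozen=True)
class Transition:
    """One step (s_t(R), a_t, r_{t+1}, s_{t+1}(R)) with a_t = (d_t, c_t).

    The field layout is hypothetical: the state is an arbitrary tuple of
    floats, d_t is an index into a gate set, and c_t is a single angle.
    """
    state: Tuple[float, ...]
    discrete_action: int       # d_t: which gate (and placement) was chosen
    continuous_action: float   # c_t: assumed here to be one rotation angle
    reward: float              # r_{t+1}
    next_state: Tuple[float, ...]


class ReplayBuffer:
    """Fixed-capacity FIFO buffer with uniform random sampling."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently evicts the oldest entry when full.
        self._storage: Deque[Transition] = deque(maxlen=capacity)

    def push(self, transition: Transition) -> None:
        self._storage.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Uniform sampling without replacement; raises ValueError if the
        # buffer holds fewer than batch_size transitions.
        return rng.sample(list(self._storage), batch_size)

    def __len__(self) -> int:
        return len(self._storage)


def demo() -> None:
    """Fill the buffer with placeholder transitions and draw one batch."""
    rng = random.Random(0)
    buffer = ReplayBuffer(capacity=100)
    bond_distance = 2.2  # placeholder value for R in the state encoding
    for step in range(10):
        buffer.push(
            Transition(
                state=(bond_distance, float(step)),
                discrete_action=rng.randrange(4),       # placeholder gate index
                continuous_action=rng.uniform(-3.14, 3.14),
                reward=0.0,                             # placeholder, no simulation
                next_state=(bond_distance, float(step + 1)),
            )
        )
    batch = buffer.sample(batch_size=4, rng=rng)
    print(f"buffer size: {len(buffer)}, batch size: {len(batch)}")


if __name__ == "__main__":
    demo()
```

Keeping $d_t$ and $c_t$ as separate fields mirrors the discrete-continuous action structure, so a learner could in principle treat the gate choice and the angle with different heads. The rewards and states above are placeholders and carry no physical meaning.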