Reinforcement learning of quantum circuit architectures for molecular potential energy curves
Maureen Krumtünger, Alissa Wilms, Paul K. Faehrmann, Jens Eisert, Jakob Kottmann, Paolo Andrea Erdman, Sumeet Khatri
TL;DR
This work addresses the scalability challenge of designing quantum circuit ansätze for molecular potential energy curves (PECs) by introducing a transferable reinforcement-learning framework that learns a bond-distance-dependent circuit $\hat{U}(R,\boldsymbol{\theta}(R))$ to prepare the ground state across bond distances $R$. Using a discrete-continuous Soft Actor-Critic (SAC) strategy, the agent learns both gate placements and continuous angles conditioned on $R$, giving direct access to the energy $E(R)$ and the state $|\psi(R)\rangle$ along the PEC. The approach yields chemically accurate energies for four- and six-qubit LiH and eight-qubit H$_4$, with shallow, interpretable circuits and substantial transferability across bond distances, offering a pathway toward generalizable quantum circuit design for larger molecular systems. The framework remains challenging to scale because of its large action space and simulation-based training, but it could be integrated with existing greedy or fragment-based methods to improve future quantum chemistry workflows.
Abstract
Quantum chemistry and optimization are two of the most prominent applications of quantum computers. Variational quantum algorithms have been proposed for solving problems in these domains. However, the design of the quantum circuit ansatz remains a challenge. Of particular interest is developing a method to generate circuits for any given instance of a problem, not merely a circuit tailored to a specific instance of the problem. To this end, we present a reinforcement learning (RL) approach to learning a problem-dependent quantum circuit mapping, which outputs a circuit for the ground state of a Hamiltonian from a given family of parameterized Hamiltonians. For quantum chemistry, our RL framework takes as input a molecule and a discrete set of bond distances, and it outputs a bond-distance-dependent quantum circuit for arbitrary bond distances along the potential energy curve. The inherently non-greedy approach of our RL method contrasts with existing greedy approaches to adaptive, problem-tailored circuit constructions. We demonstrate its effectiveness for the four-qubit and six-qubit lithium hydride molecules, as well as an eight-qubit H$_4$ chain. Our learned circuits are interpretable in a physically meaningful manner, thus paving the way for applying RL to the development of novel quantum circuits for the ground states of large-scale molecular systems.
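To make the idea of a bond-distance-dependent circuit concrete, the following is a minimal sketch under strong simplifying assumptions, not the authors' method. It replaces the molecular Hamiltonian with a hypothetical one-qubit toy family $H(R) = \cos(R)\,Z + \sin(R)\,X$, fixes the circuit to a single $R_y$ rotation, and uses a hand-picked map `theta_of_R` as a stand-in for the map the RL agent would learn. All function names are illustrative.

```python
import numpy as np

# Pauli matrices
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def hamiltonian(R):
    """Toy one-qubit family H(R) = cos(R) Z + sin(R) X (assumption: not a molecule)."""
    return np.cos(R) * Z + np.sin(R) * X

def ry(theta):
    """Single-qubit rotation exp(-i theta Y / 2), which is a real matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def theta_of_R(R):
    """Hand-picked stand-in for a learned map R -> theta(R) (assumption)."""
    return R + np.pi

def circuit_energy(R):
    """E(R) = <0| U(R)^T H(R) U(R) |0> with U(R) = Ry(theta(R))."""
    psi = ry(theta_of_R(R)) @ np.array([1.0, 0.0])
    return float(psi @ hamiltonian(R) @ psi)

def exact_energy(R):
    """Ground-state energy of H(R) by exact diagonalization."""
    return float(np.linalg.eigvalsh(hamiltonian(R))[0])

# Evaluate the same R-dependent circuit at several values of R,
# including values that need not have been used to choose theta_of_R.
Rs = np.linspace(0.5, 3.0, 6)
errors = [abs(circuit_energy(R) - exact_energy(R)) for R in Rs]
```

For this toy family the energy of the rotated state is $\cos(\theta - R)$, so the single map $\theta(R) = R + \pi$ reaches the ground-state energy for every $R$. In the setting described in the abstract, both the gate sequence and the angle map would be learned from a discrete set of bond distances rather than written down by hand.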
