Exploring fixed points and eigenstates of quantum systems with reinforcement learning

María Laura Olivera-Atencio, Jesús Casado-Pascual, Denis Lacroix

TL;DR

The work develops a quantum reinforcement-learning algorithm that identifies the fixed-point basis of a quantum operation; when the operation is a unitary Hamiltonian evolution, this basis is the Hamiltonian eigenbasis. The algorithm learns a global unitary $\mathbf{D}$ by composing two-qudit rotations under a reward/punishment scheme, while varying the evolution time $\tau$ to avoid spurious invariants. Demonstrations on random two- and three-qubit Hamiltonians and on physical models of up to six qubits (the transverse-field Ising model, TFIM, and the Richardson pairing model) show high fidelities. The approach also uncovers underlying symmetries, which enables symmetry-restricted learning, and it supports post-selection via energy fluctuations. The method provides a non-variational, potentially scalable alternative to variational eigensolvers, with possible extensions to open/dissipative quantum dynamics and to exploiting symmetry structure for efficiency.

Abstract

We introduce a reinforcement learning algorithm designed to identify the fixed points of a given quantum operation. The method iteratively constructs the unitary transformation that maps the computational basis onto the basis of fixed points through a reward-penalty scheme based on quantum measurements. In cases where the operation corresponds to a Hamiltonian evolution, this task reduces to determining the Hamiltonian eigenstates. The algorithm is first benchmarked on random Hamiltonians acting on two and three qubits and then applied to many-body systems of up to six qubits, including the transverse-field Ising model and the all-to-all pairing Hamiltonian. In both cases, the algorithm is demonstrated to perform successfully; in the pairing model, it can also reveal hidden symmetries, which can be exploited to restrict learning to specific symmetry sectors. Finally, we discuss the possibility of post-selecting high-fidelity states even when full convergence has not been reached.
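
As a minimal illustration of the statement that, for Hamiltonian evolution, finding fixed points reduces to finding eigenstates, the sketch below uses a single-qubit toy Hamiltonian $H=\sigma_x$ (an assumption chosen for simplicity, not a system from the paper). The helper names `evolve_sigma_x` and `survival_probability` are hypothetical. The sketch also suggests why sampling the evolution time $\tau$ can help: at special values of $\tau$, a state that is not an eigenstate can still return to itself.

```python
import math

def evolve_sigma_x(state, tau):
    """Apply U(tau) = exp(-i*tau*sigma_x) = cos(tau)*I - i*sin(tau)*sigma_x."""
    a, b = complex(state[0]), complex(state[1])
    c, s = math.cos(tau), math.sin(tau)
    # sigma_x swaps the two amplitudes
    return (c * a - 1j * s * b, c * b - 1j * s * a)

def survival_probability(state, tau):
    """Return |<psi|U(tau)|psi>|^2; it equals 1 when psi is unchanged up to a phase."""
    a, b = complex(state[0]), complex(state[1])
    ua, ub = evolve_sigma_x(state, tau)
    overlap = a.conjugate() * ua + b.conjugate() * ub
    return abs(overlap) ** 2

plus = (1 / math.sqrt(2), 1 / math.sqrt(2))  # eigenstate of sigma_x
zero = (1.0, 0.0)                            # computational state, not an eigenstate

# The eigenstate survives for any evolution time (toy time grid, an assumption).
eigen_ok = all(abs(survival_probability(plus, t) - 1.0) < 1e-12
               for t in (0.3, 1.0, 2.5, 7.1))

# The non-eigenstate also "survives" at the special time tau = pi,
# but not at a generic time such as tau = 1.0.
spurious = survival_probability(zero, math.pi)
generic = survival_probability(zero, 1.0)
```

Here `spurious` is 1 up to rounding even though `zero` is not an eigenstate, while `generic` equals $\cos^2(1)$, which is well below 1. Drawing $\tau$ afresh in each iteration makes such accidental recurrences unlikely to persist, which is one plausible reading of the role of the varying evolution time.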

Paper Structure

This paper contains 15 sections, 21 equations, 10 figures.

Figures (10)

  • Figure 1: Schematic representation of the proposed RL algorithm.
  • Figure 2: Algorithm results for random two-qubit Hamiltonians ($d=4$). Panels (a) and (b) show the fidelities $F_k^{(j)}$ and the maximum exploration parameter $W_k^{(\mathrm{M})}$ as functions of the iteration number $k$ without the reset mechanism described in Sec. \ref{subsec:algoft}, while panels (c) and (d) show the results with reset. Each realization uses a different random Hamiltonian, and in each iteration the dimensionless evolution time $\tilde{\tau}$ is uniformly sampled from $[0,100]$. Parameters are $r=0.9$, $p=2/r$, $w_{\mathrm{th}}=0.005$, $w_{\mathrm{r}}=0.01$, and $N_{\mathrm{r}}=100$. Fidelities for the four computational basis states are plotted in different colors; however, because the curves largely overlap, the colors are difficult to distinguish. In the left panels, dashed horizontal lines indicate the maximum and minimum fidelities, $F_{\mathrm{max}}$ and $F_{\mathrm{min}}$.
  • Figure 3: Same as Fig. 2, but for the case of three qubits ($d=8$). All parameters are identical to those used for the two-qubit case shown in Fig. 2, except for the increased Hilbert-space dimension.
  • Figure 4: Results of the algorithm applied to the TFIM Hamiltonian in Eq. (\ref{eq:HTFIM}) with $J/h = 1$ and $K/h = 0.5$. The remaining parameters are $r = 0.9$, $p = 2/r$, $w_{\mathrm{th}} = 0.005$, $w_{\mathrm{r}} = 0.05$, and $N_{\mathrm{r}} = 50$. In each iteration, the dimensionless evolution time $\tilde{\tau}$ is uniformly sampled from $[0,600]$. The upper panels [(a) and (b)] correspond to the two-qubit case ($d=4$), the middle panels [(c) and (d)] to three qubits ($d=8$), and the lower panels [(e) and (f)] to four qubits ($d=16$). In the left panels [(a), (c), and (e)], the mean fidelities $F_k^{(j)}$ associated with each computational-basis state $\ket{j}$ are shown as a function of the iteration number $k$, with different colors indicating different states. Since many curves overlap, color differences may be difficult to distinguish. Dashed horizontal lines indicate the maximum and minimum fidelities, $F_{\mathrm{max}}$ and $F_{\mathrm{min}}$. The right panels [(b), (d), and (f)] display, for each computational basis state $\ket{j}$, the dimensionless expectation values of the energies obtained after each stochastic realization has converged, plotted against the corresponding realization index. In these panels, horizontal solid lines represent the exact eigenenergies computed by numerical diagonalization.
  • Figure 5: Results analogous to those shown in Fig. 4, obtained for $r = 0.93$. All other parameters are identical to those used in Fig. 4.
  • ...and 5 more figures