Hamiltonian-based Quantum Reinforcement Learning for Neural Combinatorial Optimization

Georg Kruse, Rodrigo Coelho, Andreas Rosskopf, Robert Wille, Jeanette Miriam Lorenz

TL;DR

This work tackles binary combinatorial optimization by marrying quantum computing with neural reinforcement learning through a Hamiltonian-based QRL framework. The method constructs variational quantum circuit ansatzes directly from the problem's QUBO Hamiltonian (notably the sge-sgv design), enabling application to a broad class of CO problems and addressing trainability via inequality encoding and correlated parameterization. Empirical results on Weighted-MaxCut, Unit Commitment, and Knapsack show that Hamiltonian-based QRL can outperform QAOA in identifying optimal or valid solutions, while maintaining feasibility through masking and problem-aligned rewards, albeit with higher training cost. Overall, the approach offers a scalable, generalizable quantum-classical learning paradigm with promising potential for hardware-efficient implementations and strong generalization to unseen problem instances.
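To make the phrase "the problem's QUBO Hamiltonian" concrete, the following is a minimal sketch of the standard textbook mapping from a QUBO cost $x^T Q x$ with $x_i \in \{0,1\}$ to an Ising-type Hamiltonian with spins $z_i = 1 - 2x_i \in \{-1,+1\}$. It is not taken from the paper; the function names `qubo_to_ising`, `qubo_energy`, and `ising_energy` and the example matrix are illustrative assumptions.

```python
import itertools
import numpy as np


def qubo_to_ising(Q):
    """Map x^T Q x (x in {0,1}^n) to offset + sum_i h_i z_i + sum_{i<j} J_ij z_i z_j.

    Uses the substitution x_i = (1 - z_i) / 2 with z_i in {-1, +1}.
    Hypothetical helper for illustration, not the authors' code.
    """
    Q = np.asarray(Q, dtype=float)
    diag = np.diag(Q)
    sym = Q + Q.T
    off = sym - np.diag(np.diag(sym))      # off[i, j] = Q_ij + Q_ji for i != j, 0 on the diagonal
    J = np.triu(off, k=1) / 4.0            # pair couplings, stored for i < j only
    h = -off.sum(axis=1) / 4.0 - diag / 2.0
    offset = off.sum() / 8.0 + diag.sum() / 2.0
    return h, J, offset


def qubo_energy(Q, x):
    """Cost of a binary assignment x under the QUBO matrix Q."""
    return float(x @ np.asarray(Q, dtype=float) @ x)


def ising_energy(h, J, offset, z):
    """Energy of a spin assignment z under the Ising coefficients."""
    return float(offset + h @ z + z @ J @ z)


# Assumed toy instance (3 binary variables), chosen only for illustration.
Q_example = np.array([[-1.0, 2.0, 0.0],
                      [0.0, -1.0, 2.0],
                      [0.0, 0.0, -1.0]])
h, J, offset = qubo_to_ising(Q_example)

# Both formulations agree on every one of the 2^3 assignments.
for bits in itertools.product([0, 1], repeat=3):
    x = np.array(bits, dtype=float)
    z = 1.0 - 2.0 * x
    assert abs(qubo_energy(Q_example, x) - ising_energy(h, J, offset, z)) < 1e-9
```

In such a mapping, the nonzero entries of `h` and `J` correspond to single-qubit and two-qubit Pauli-Z terms, which is the kind of structure a Hamiltonian-derived ansatz can reuse when placing its gates.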

Abstract

Advancements in Quantum Computing (QC) and Neural Combinatorial Optimization (NCO) represent promising steps in tackling complex computational challenges. On the one hand, Variational Quantum Algorithms such as QAOA can be used to solve a wide range of combinatorial optimization problems. On the other hand, the same class of problems can be solved by NCO, a method that has shown promising results, particularly since the introduction of Graph Neural Networks. Given recent advances in both research areas, we introduce Hamiltonian-based Quantum Reinforcement Learning (QRL), an approach at the intersection of QC and NCO. We model our ansatzes directly on the combinatorial optimization problem's Hamiltonian formulation, which allows us to apply our approach to a broad class of problems. Our ansatzes show favourable trainability properties when compared to hardware-efficient ansatzes, while also not being limited to graph-based problems, unlike previous works. In this work, we evaluate the performance of Hamiltonian-based QRL on a diverse set of combinatorial optimization problems to demonstrate the broad applicability of our approach and compare it to QAOA.

Paper Structure (19 sections, 19 equations, 9 figures)

Figures (9)

  • Figure 1: Single-layer VQC for QRL: $U(s,\theta)$ generally consists of three blocks, which are repeated in each layer: an encoding-block, where the features of state $s$ (possibly scaled by additional trainable parameters $\lambda$) are encoded; a variational-block, where additional parameterized quantum gates are placed; and an entangling-block, where the entanglement gates are placed. However, this structure is not static, and blocks may be switched or merged with one another.
  • Figure 2: Schematic illustration of a single layer of a three-qubit VQC for the sets of generators $G_{PP,P}$ in Eq. \ref{generator_1} (left) and $G_{PP+P,P}$ in Eq. \ref{generator_2} (right) of the sge-sgv ansatz.
  • Figure 3: Schematic illustration of a single layer of a three-qubit VQC for the sets of generators $G_{mge-sgv}$ in Eq. \ref{eq:mge-sgv} (upper left), $G_{mge-mgv}$ in Eq. \ref{eq:mge-mgv} (upper right), and $G_{sge-sgv+hea}$ in Eq. \ref{eq:sge-sgv+hea} (lower center).
  • Figure 4: Hamiltonian-based QRL: A single layer of the sge-sgv ansatz consists of an encoding-block, where the features of state $s$ with trainable parameters $\theta$ are encoded, and a variational-block, where additional parameterized quantum gates are placed. If additional annotations $\alpha$ are used, we also refer to the variational-block as annotation-block.
  • Figure 5: Numerical results of the variance of the cost function partial derivatives for the introduced ansatzes with $L=5$ layers. For each point, we evaluated the variance of the gradient for 1000 samples.
  • ...and 4 more figures
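
The three-block layer described in the Figure 1 caption (encoding, variational, entangling) can be sketched with a small NumPy statevector simulation. This is only an illustration under stated assumptions: the gate choices (RX for encoding, RY for the variational part, a linear chain of CZ gates for entanglement) and all function names are hypothetical and are not taken from the paper, whose ansatzes are instead derived from the problem Hamiltonian.

```python
import numpy as np


def rx(angle):
    """Single-qubit rotation about X."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)


def ry(angle):
    """Single-qubit rotation about Y."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def apply_single(state, gate, qubit, n):
    """Apply a 2x2 gate to one qubit of an n-qubit statevector."""
    psi = state.reshape((2,) * n)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))  # contracted axis comes out at position 0
    psi = np.moveaxis(psi, 0, qubit)                    # move it back to the qubit's position
    return psi.reshape(-1)


def apply_cz(state, a, b, n):
    """Apply a controlled-Z: flip the sign where qubits a and b are both 1."""
    psi = state.reshape((2,) * n).copy()
    index = [slice(None)] * n
    index[a] = 1
    index[b] = 1
    psi[tuple(index)] *= -1
    return psi.reshape(-1)


def vqc_layer(state, s, lam, theta):
    """One layer: encoding-block, variational-block, entangling-block (assumed gate set)."""
    n = len(s)
    for q in range(n):                      # encoding-block: features s scaled by lambda
        state = apply_single(state, rx(lam[q] * s[q]), q, n)
    for q in range(n):                      # variational-block: trainable angles theta
        state = apply_single(state, ry(theta[q]), q, n)
    for q in range(n - 1):                  # entangling-block: CZ chain on neighbours
        state = apply_cz(state, q, q + 1, n)
    return state


def z_expectations(state, n):
    """Expectation value of Pauli-Z on each qubit."""
    probs = np.abs(state.reshape((2,) * n)) ** 2
    values = []
    for q in range(n):
        other_axes = tuple(a for a in range(n) if a != q)
        marginal = probs.sum(axis=other_axes)
        values.append(marginal[0] - marginal[1])
    return np.array(values)


n_qubits = 3
initial = np.zeros(2 ** n_qubits, dtype=complex)
initial[0] = 1.0                            # start in |000>
features = np.array([0.3, -0.7, 1.1])       # assumed state features s
scaling = np.ones(n_qubits)                 # assumed lambda values
angles = np.array([0.1, 0.2, 0.3])          # assumed theta values
final = vqc_layer(initial, features, scaling, angles)
readout = z_expectations(final, n_qubits)   # could feed a policy head in a QRL setup
```

Stacking several calls to `vqc_layer` with separate parameters per layer gives the multi-layer circuit; as the caption notes, the order of the blocks is not fixed and blocks may be merged.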