Understanding the Nature of Depth-1 Equivariant Quantum Circuit

Jonathan Teo, Lee Xin Wei, Hoong Chuin Lau

TL;DR

This paper investigates the depth-1 Equivariant Quantum Circuit (EQC) for solving the Travelling Salesman Problem (TSP) within Quantum Reinforcement Learning (QRL). It introduces Size-Invariant Grid Search (SIGS), a lightweight training optimization that leverages size-invariant properties to replicate depth-1 EQC performance at a fraction of the runtime, enabling simulations of TSP instances with up to 350 nodes. The work provides theoretical results that characterize the favorable parameter region and constrain the search space, together with extensive simulations showing that SIGS matches RL-based training in performance while running far faster (on TSP-100, total simulation time drops by 96.4%). It further analyzes the limitations of depth-1 EQCs, demonstrates limited gains from deeper EQCs in this setting, and positions SIGS as a practical benchmarking tool for the QRL community, with potential applicability to broader combinatorial optimization problems.

Abstract

The Equivariant Quantum Circuit (EQC) for the Travelling Salesman Problem (TSP) has been shown to achieve near-optimal performance on small TSP instances (up to 20 nodes) using only two parameters at depth 1. However, extending EQCs to larger TSP instances remains challenging because quantum circuit simulation requires exponential time and memory, and because noise and decoherence increase when running on actual quantum hardware. In this work, we propose the Size-Invariant Grid Search (SIGS), an efficient training optimization for Quantum Reinforcement Learning (QRL), and use it to simulate the outputs of a trained depth-1 EQC on TSP instances with up to 350 nodes, well beyond previously tractable limits. On TSP-100, we reduce total simulation time by 96.4% compared with RL simulations that use the analytical expression (from 151 minutes using RL to under 6 minutes using SIGS), while achieving a mean optimality gap within 0.005 of the RL-trained model on the test set. SIGS provides a practical benchmarking tool for the QRL community, allowing us to efficiently analyze the performance of QRL algorithms on larger problem sizes. We provide a theoretical explanation for SIGS, called the Size-Invariant Properties, that goes beyond the concept of equivariance discussed in prior literature.

Paper Structure

This paper contains 28 sections, 7 theorems, 27 equations, 16 figures, 4 tables.

Key Result

Lemma 1

(Relative Ordering of Q-values) For a fixed, unexplored node $a$ and a set of candidate last nodes $T$, the Q-value of visiting node $a$ from node $t \in T$ is proportional to $d_{ta} \tan (\gamma e_{ta})$.
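As a rough illustration of how Lemma 1 could be used, the sketch below computes the quantity $d_{ta} \tan(\gamma e_{ta})$ for each candidate last node $t$ and sorts the candidates by it. The function names `relative_q_scores` and `rank_candidates` are hypothetical, and the inputs `d` and `e` are assumed to hold the values $d_{ta}$ and $e_{ta}$ for a fixed node $a$; this excerpt does not define how those values are obtained. The lemma states proportionality only, so the sorted order matches the Q-value order only if the shared proportionality constant is positive (and is reversed if it is negative).

```python
import math


def relative_q_scores(d, e, gamma):
    """Return d_ta * tan(gamma * e_ta) for each candidate last node t.

    d, e: sequences of d_ta and e_ta for a fixed node a (assumed inputs).
    gamma: the circuit parameter gamma.
    """
    return [d_ta * math.tan(gamma * e_ta) for d_ta, e_ta in zip(d, e)]


def rank_candidates(d, e, gamma):
    """Return candidate indices sorted by descending relative score.

    Assumes the proportionality constant in Lemma 1 is positive;
    reverse the result if it is negative.
    """
    scores = relative_q_scores(d, e, gamma)
    return sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)
```

Because only the relative ordering matters here, the unknown proportionality constant never needs to be evaluated.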

Figures (16)

  • Figure 1: Equivariant Quantum Circuit (EQC) (Depth 1). $\gamma$ and $\beta$ are trainable parameters. Each qubit $q_i$ corresponds to node $i$, so an $n=4$ qubit system can serve as a PQC for TSP-4 instances. The state $s_i$ is equal to $\pi$ if node $i$ is yet to be visited in the tour; otherwise $s_i = 0$.
  • Figure 2: Visualization of $e_{t_1a}<e_{t_2a} < e_{t_3a}$ (Theorem 2)
  • Figure 3: Comparison of Mean Gaps and Worst Gaps on the Test Set, and of Total Training Time, between RL and SIGS. Details of SIGS are discussed in Section \ref{sect:size-invariant-method-train-method}. The methodology for "RL" is discussed in Supplementary Information \ref{sect:experimental-config-qrl}. Tabulated results can be found in Supplementary Information \ref{appendix:size-invariant-tm-full-comp}.
  • Figure 4: Mean Test Gaps, Total Time and value of $\gamma^*_{SIGS}$ when scaling up to TSP-350, using $\Delta\gamma = 0.1$. Note that the time taken consists of the time to select the parameter $\gamma$ plus the time to evaluate it on the test set.
  • Figure 5: Mean Optimality Gap Heatmap of Depth 1 EQC with respect to $\gamma,\beta$. These landscapes are Mean Gaps (see equation \ref{eq:mean-and-worst-gap}) over 50 instances for TSP-10 (Figure \ref{fig:grid-search-heatmap-tsp10}) and 30 instances for TSP-20 and above (Figures \ref{fig:grid-search-heatmap-tsp20} to \ref{fig:grid-search-heatmap-tsp50}). Smaller optimality gaps are shown in brighter colors, and larger optimality gaps in darker colors. The TSP-10 (Figure \ref{fig:grid-search-heatmap-tsp10}) and TSP-20 (Figure \ref{fig:grid-search-heatmap-tsp20}) heatmaps span the entire parameter space $\gamma,\beta \in (0, 2\pi)$. The TSP-25, TSP-30, TSP-40, and TSP-50 heatmaps (Figures \ref{fig:grid-search-heatmap-tsp25}-\ref{fig:grid-search-heatmap-tsp50}) cover subsets of the parameter space to verify the size-invariant landscape. $\beta \in \{0.95,1.05, 2.2, 3.3,4.4\}$ is plotted on the vertical axis, while $\gamma \in \{0.8,1.0,\cdots, 2.0\}$ is plotted on the horizontal axis. Because these heatmaps were created in the earlier phases of the project, we approximated the optimal tour as the best of 30 runs of the simulated annealing implementation in dreo:hal-01341683 for TSP-20 to TSP-50.
  • ...and 11 more figures

Theorems & Definitions (23)

  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Definition 1
  • Theorem 2
  • proof
  • Definition 2
  • Lemma 2
  • proof
  • ...and 13 more