Improving Generalization and Trainability of Quantum Eigensolvers via Graph Neural Encoding

Jungyun Lee; Daniel K. Park

TL;DR

This work tackles the challenge of generalizing ground-state preparation across Hamiltonians and mitigating barren plateaus in variational quantum eigensolvers. It introduces EGATE-NNVQE, an end-to-end framework that encodes arbitrary one- and two-local Pauli Hamiltonians as Hamiltonian-graphs and learns structure-aware embeddings via an edge-featured graph attention autoencoder (EGATE). A downstream neural predictor then uses the learned latent representation to generate VQE parameters for unseen Hamiltonians, yielding high overlap with the true ground state and improved trainability. The approach generalizes well across 1D and 2D spin models, accelerates convergence in quantum subspace methods such as SKQD, and shows robust resistance to barren plateaus, highlighting its practical impact for near-term and fault-tolerant quantum algorithms.

Abstract

Determining the ground state of a many-body Hamiltonian is a central problem across physics, chemistry, and combinatorial optimization, yet it is often classically intractable due to the exponential growth of Hilbert space with system size. Even on fault-tolerant quantum computers, quantum algorithms with convergence guarantees (such as quantum phase estimation and quantum subspace methods) require an initial state with sufficiently large overlap with the true ground state to be effective. Variational quantum eigensolvers (VQEs) are natural candidates for preparing such states; however, standard VQEs typically exhibit poor generalization, requiring retraining for each Hamiltonian instance, and often suffer from barren plateaus, where gradients can vanish exponentially with circuit depth and system size. To address these limitations, we propose an end-to-end representation learning framework that combines a graph autoencoder with a classical neural network to generate VQE parameters that generalize across Hamiltonian instances. By encoding interaction topology and coupling structure, the proposed model produces high-overlap initial states without instance-specific optimization. Through extensive numerical experiments on families of one- and two-local Hamiltonians, we demonstrate improved generalization and trainability, manifested as reduced test error and a significantly milder decay of gradient variance. We further show that our method substantially accelerates convergence in quantum subspace-based eigensolvers, highlighting its practical impact for downstream quantum algorithms.

Paper Structure (25 sections, 17 equations, 9 figures, 4 tables, 2 algorithms)

Introduction
Related Works
Model Description
Hamiltonian-Graph
Edge-Featured Graph Attention Autoencoder
Overall Workflow
Result
Generalization
Application to Quantum Subspace Methods
Barren Plateaus
Conclusion and Future Work
Algorithmic Architecture
EGATE
Node Module & Edge Module
Decoder
...and 10 more sections

Figures (9)

Figure 1: (a) Hamiltonian-graph for a general two-local 4-qubit Hamiltonian, as defined in Eq. (\ref{eq:general two-local H}), and (b) Hamiltonian-graph for a 4-qubit 1D chain. If the two-local edge features in (b) are set to $\vec{e}_{ij} = (1, 1, \lambda)$ and the one-local edge features to $\vec{s}_i = (\Delta)$, the graph represents the 1D XXZ Heisenberg spin chain described in Eq. (\ref{eq:XXZ spin}).
Figure 2: Architecture of EGATE and its components. The left panel illustrates the encoder-decoder structure of EGATE. The encoder consists of a GNN block for message passing followed by a pooling block that compresses the input H-graph $\textbf{G} = (\textbf{O}, \textbf{E})$ into a latent vector $\vec{g}$, while the decoder reconstructs the H-graph $\textbf{G}' = (\textbf{O}', \textbf{E}')$ from this latent representation. The center panel depicts the problem-specific GNN block, composed of stacked EGAT layers and a merge layer that jointly update node and edge features. The right panel details a single EGAT layer: the top and bottom schematics illustrate the update procedures of the node and edge modules, respectively. Here, $\alpha$ and $\beta$ denote attention coefficients. The corresponding pseudocode is provided in Algs. \ref{appendix:alg:node module} and \ref{appendix:alg:edge module}.
Figure 3: Schematic overview of EGATE-NNVQE for training (left) and inference (right).
Figure 4: Generalization performance of the NNVQE baseline and EGATE-NNVQE on 1D Hamiltonian families, $H_{\text{XXZ}}$ (Eq. (\ref{eq:XXZ})) and $H_{\text{XXZ+X}}$ (Eq. (\ref{eq:XXZ+X})), with system size $n \in \{4, 6, 8\}$ qubits, where $D$ denotes the number of ansatz blocks. Performance is evaluated using MSE and MRE (lower is better, $\downarrow$), and MF (higher is better, $\uparrow$). Rows correspond to the number of hidden layers (HL = 1 or 2) in the NNVQE. Bars indicate mean $\pm$ standard deviation over 10 random seeds, where each seed’s value is the average over all test instances. In (a), each seed uses 20 training and 1,000 test instances of $H_\text{XXZ}$; in (b), 400 training and 40,000 test instances of $H_\text{XXZ+X}$. The right axis (Impr.) reports the relative improvement (%) of EGATE-NNVQE (green) over the NNVQE (orange) baseline, shown as a blue dashed line. EGATE-NNVQE consistently improves all metrics across settings, with relative improvements that remain comparable or increase as the system size ($n$) grows, indicating that the NNVQE baseline increasingly struggles to capture relevant structural information at larger scales. To isolate the role of the EGATE representation, (a) includes input-expanded-NNVQE (gray), where the NNVQE input dimension is matched to the EGATE latent size. Its performance is comparable to, or worse than, that of the NNVQE baseline, indicating that the observed improvements arise from the structured latent representation learned by EGATE rather than from increased dimensionality or model capacity.
Figure 5: Generalization performance of the NNVQE baseline and EGATE-NNVQE on 2D Hamiltonians on $3\times3$ lattices, $H_{\mathrm{XXZ}}^{3\times3}$ (Eq. (\ref{eq:2D_XXZ})) and $H_{\mathrm{XYZ}}^{3\times3}$ (Eq. (\ref{eq:2D_XYZ})). The same performance metrics and notation as in Fig. \ref{fig:1D_generalization_result} are used. In contrast to the one-dimensional case, the system size is fixed ($n=9$, $D=2$), and the horizontal axis indexes the Hamiltonian family. Results are reported as mean $\pm$ standard deviation over 10 random seeds, where each seed’s value is averaged over all test instances. We use 400 training and 40,000 test instances for $H_{\mathrm{XXZ}}^{3\times3}$, and 405 training and 32,000 test instances for $H_{\mathrm{XYZ}}^{3\times3}$. Consistent with Fig. \ref{fig:1D_generalization_result}, EGATE-NNVQE outperforms the NNVQE baseline across all metrics for both Hamiltonian families.
...and 4 more figures
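
The Hamiltonian-graph in the Figure 1 caption can be illustrated with a small data-structure sketch. It builds the graph for a 1D chain with two-local features $\vec{e}_{ij} = (1, 1, \lambda)$ and one-local features $\vec{s}_i = (\Delta)$, as stated in the caption. Everything else is an assumption made for illustration and is not taken from the paper: the names `HamiltonianGraph` and `xxz_chain_graph`, the reading of the three edge entries as XX, YY, and ZZ coefficients, and the choice of open boundary conditions.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class HamiltonianGraph:
    """Hypothetical container for a Hamiltonian-graph (illustration only)."""
    n_qubits: int
    # One-local features s_i, one tuple per qubit index i.
    one_local: Dict[int, Tuple[float, ...]]
    # Two-local features e_ij, keyed by qubit pairs (i, j) with i < j.
    two_local: Dict[Tuple[int, int], Tuple[float, float, float]]


def xxz_chain_graph(n_qubits: int, lam: float, delta: float) -> HamiltonianGraph:
    """Build the graph of an open 1D chain with e_ij = (1, 1, lam), s_i = (delta)."""
    one_local = {i: (delta,) for i in range(n_qubits)}
    # Assumption: nearest-neighbour couplings only, open boundary conditions.
    two_local = {(i, i + 1): (1.0, 1.0, lam) for i in range(n_qubits - 1)}
    return HamiltonianGraph(n_qubits, one_local, two_local)


# The 4-qubit chain of Figure 1(b), with illustrative coupling values.
graph = xxz_chain_graph(4, lam=0.5, delta=0.2)
for (i, j), e_ij in graph.two_local.items():
    print(f"edge ({i}, {j}): e_ij = {e_ij}")
```

Keeping the one-local and two-local features in separate mappings mirrors the caption's distinction between $\vec{s}_i$ and $\vec{e}_{ij}$; a graph neural network library would typically expect these flattened into index and feature arrays instead.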
