Accelerating Quantum Monte Carlo Calculations with Set-Equivariant Architectures and Transfer Learning

Manuel Gallego, Sebastián Roca-Jerat, David Zueco, Jesús Carrete

TL;DR

The paper tackles the bottleneck of evaluating observables in variational quantum Monte Carlo by using set-transformer architectures that treat the QMC samples as an unordered set, so that predictions do not depend on the order of the samples. It demonstrates both regression (magnetization moments and Rényi entropy) and classification (phase detection) tasks for a 1D long-range spin chain, achieving speedups of three to four orders of magnitude in observable estimation and enabling transfer learning to reuse knowledge across system sizes. The approach yields accurate phase boundaries and finite-size scaling exponents consistent with the literature, while reducing training costs through data augmentation and partial weight freezing. The practical impact is a scalable, data-efficient framework for QMC analyses of complex spin systems, with caveats tied to the quality of the ground states and the amount of data required for the target sizes.

Abstract

Machine-learning (ML) ansätze have greatly expanded the accuracy and reach of variational quantum Monte Carlo (QMC) calculations, in particular when exploring the manifold quantum phenomena exhibited by spin systems. However, the scalability of QMC is still compromised by several other bottlenecks, and specifically those related to the actual evaluation of observables based on random deviates that lies at the core of the approach. Here we show how the set-transformer architecture can be used to dramatically accelerate or even bypass that step, especially for time-consuming operators such as powers of the magnetization. We illustrate the procedure with a range of examples structured around quantum spin systems with long-range interactions, and comprising both regressions (to predict observables) and classifications (to detect phase transitions). Moreover, we show how transfer learning can be leveraged to reduce the training cost by reusing knowledge from different systems and smaller system sizes.
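As a point of reference for the step the abstract describes, the following is a minimal sketch of how a power of the magnetization can be estimated from a batch of sampled spin configurations. It assumes the simplest case, in which the observable is diagonal in the sampling basis; that may not match the setup used in the paper, and operators that are not diagonal need local estimators built from ratios of wave-function amplitudes, which cost more. The function name `magnetization_moment` and the uniformly random spins are hypothetical stand-ins, not the authors' code or data.

```python
import random


def magnetization_moment(samples, power=2):
    """Average of m**power over the samples, with m = (1/N) * sum of spins.

    Assumes the observable is diagonal in the basis the samples were drawn in.
    """
    total = 0.0
    for spins in samples:
        m = sum(spins) / len(spins)
        total += m ** power
    return total / len(samples)


random.seed(0)
# Hypothetical stand-in for QMC output: uniformly random spins,
# not configurations drawn from a variational state.
samples = [[random.choice((-1, 1)) for _ in range(50)] for _ in range(2048)]
m2 = magnetization_moment(samples, power=2)
print(m2)
```

For uncorrelated random spins the second moment is expected to be close to $1/N$; a fully polarized configuration gives exactly 1.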

Paper Structure

This paper contains 9 sections, 6 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Structure of the main building blocks of the set transformer. The multihead attention block (MAB) is the core component, defined as a standard multihead dot-product attention with LayerNorm and ReLU, but without positional encoding or dropout. The induced set-attention block (ISAB) is an alternative to self-attention that preserves the set equivariance but improves scaling; it is constructed as $\mathrm{ISAB}_m(\bm{X}) = \mathrm{MAB}[\bm{X}, \mathrm{MAB}(\bm{I}, \bm{X})]$ with $m < n$ trainable inducing points $\bm{I}$. The pooling by multihead attention (PMA) block defines the set-invariant decoder stage, specified as $\mathrm{PMA}_k(\bm{X}) = \mathrm{MAB}[\bm{S}, \mathrm{FF}(\bm{X})]$, where $\bm{S}$ are $k$ trainable seed vectors and $\mathrm{FF}$ is a feed-forward block.
  • Figure 2: Workflow of the transfer learning scheme. The model is first trained on data from the smaller system. A new model then inherits a subset of the trained parameters, which are frozen to prevent further updates. The remaining parameters are subsequently trained on data from the larger system. "Dense" refers to a fully connected block, whose precise form will be different for regressions and classifications.
  • Figure 3: Accuracy of the predictions with and without transfer learning for $N=100$ and $\alpha=2.5$. The standard error over the validation set is computed as $\delta m^2_z = \sqrt{\frac{1}{n-1}\sum_{i=1}^n \left(m_{z,i}^2-\tilde{m}_{z,i}^2\right)^2}$, where $n$ is the number of data points in that set, $m_{z,i}^2$ are the fluctuations of the magnetization calculated via QMC with the original variational states and $2048$ samples, and $\tilde{m}_{z,i}^2$ are the set-transformer predictions for the same quantities.
  • Figure 4: Results of the nested binary classification of states into paramagnetic, ferromagnetic and antiferromagnetic categories for a system of $50$ spins throughout the ranges of $\alpha$ and $J$. This model was trained on $\alpha\in\{0, 3.2, 6\}$, highlighted with black rectangles, with up to $1024$ samples per training point. Also shown are the phase boundaries extracted from Refs. [romanroche2023, zhu2018fidelity, koziol2021quantumcritical, sun2017].
  • Figure 5: Regression results for the second moment of the magnetization for a system of $50$ spins throughout the ranges of $\alpha$ and $J$. Left: $\langle m^2 \rangle$ predicted by the set transformer. Right: difference between the predictions and the results obtained via QMC calculations, on a logarithmic scale. This model was trained on $\alpha\in\{0, 3.2, 6\}$, highlighted with black rectangles, with up to $1024$ samples per training point. Also shown are the phase boundaries extracted from Refs. [romanroche2023, zhu2018fidelity, koziol2021quantumcritical, sun2017].
  • ...and 3 more figures
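
The building blocks named in the caption of Figure 1 can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the layer ordering inside `mab` follows a common formulation of the set transformer (attention, residual plus LayerNorm, then a ReLU feed-forward layer with a second residual plus LayerNorm), biases and learned LayerNorm parameters are omitted, random matrices stand in for trained weights, and the sizes `D`, `HEADS`, `n`, `N`, `m`, `k` as well as the linear embedding `w_in` are hypothetical choices. The last lines check numerically that the ISAB encoder is equivariant and the PMA output invariant under a reordering of the samples.

```python
import numpy as np

rng = np.random.default_rng(0)
D, HEADS = 8, 2  # feature width and head count: illustrative values only


def layer_norm(x, eps=1e-5):
    # Normalize each row; no learned scale or shift in this sketch.
    mu = x.mean(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)


def init_mab():
    # Random matrices stand in for trained weights (biases omitted).
    return {name: rng.normal(scale=D ** -0.5, size=(D, D))
            for name in ("q", "k", "v", "o", "ff")}


def split_heads(t):
    # (rows, D) -> (HEADS, rows, D // HEADS)
    return t.reshape(t.shape[0], HEADS, -1).transpose(1, 0, 2)


def mab(p, x, y):
    """MAB(x, y): rows of x attend to rows of y, then a ReLU feed-forward layer."""
    q, k, v = (split_heads(t) for t in (x @ p["q"], y @ p["k"], y @ p["v"]))
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(D // HEADS)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    w = np.exp(scores)
    w = w / w.sum(axis=-1, keepdims=True)
    att = (w @ v).transpose(1, 0, 2).reshape(x.shape[0], D)  # merge heads
    h = layer_norm(x + att @ p["o"])
    return layer_norm(h + np.maximum(h @ p["ff"], 0.0))


def isab(p_inner, p_outer, inducing, x):
    # ISAB_m(X) = MAB[X, MAB(I, X)]
    return mab(p_outer, x, mab(p_inner, inducing, x))


def pma(p, seeds, w_ff, x):
    # PMA_k(X) = MAB[S, FF(X)], with a one-layer ReLU feed-forward block
    return mab(p, seeds, np.maximum(x @ w_ff, 0.0))


n, N, m, k = 16, 10, 4, 1  # samples, spins, inducing points, seeds (hypothetical)
w_in = rng.normal(scale=N ** -0.5, size=(N, D))  # assumed linear embedding of a sample
inducing = rng.normal(size=(m, D))
seeds = rng.normal(size=(k, D))
w_ff = rng.normal(scale=D ** -0.5, size=(D, D))
p_inner, p_outer, p_pool = init_mab(), init_mab(), init_mab()


def encode(spins):
    return isab(p_inner, p_outer, inducing, spins @ w_in)


def pool(spins):
    return pma(p_pool, seeds, w_ff, encode(spins))


spins = rng.choice([-1.0, 1.0], size=(n, N))  # one row per sampled configuration
perm = rng.permutation(n)
equivariant = np.allclose(encode(spins)[perm], encode(spins[perm]), atol=1e-6)
invariant = np.allclose(pool(spins), pool(spins[perm]), atol=1e-6)
print(equivariant, invariant)
```

Because the inducing points and seed vectors are fixed parameters rather than functions of the sample order, reordering the rows of the input only reorders the rows of the encoder output and leaves the pooled output unchanged, up to floating-point rounding. The ISAB also avoids the $n \times n$ attention matrix of full self-attention, since every attention step involves the $m$ inducing points on one side.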