Interpretable representation learning of quantum data enabled by probabilistic variational autoencoders

Paulin de Schoulepnikoff, Gorka Muñoz-Gil, Hendrik Poulsen Nautrup, Hans J. Briegel

TL;DR

The paper addresses the challenge of learning interpretable representations from intrinsically stochastic quantum data using variational autoencoders. It introduces a conditional probabilistic VAE (cpVAE) with an autoregressive decoder inspired by Neural Quantum States and a TC-VAE–based loss to learn the full quantum state distribution rather than single samples. Across benchmark spin models (NNN-TFIM and LR-TFIM) and experimental Rydberg-atom data, cpVAE yields latent factors that capture non-mean-field correlations and uncovers phase structure in an unsupervised manner. The work provides a practical framework for unsupervised, interpretable quantum state analysis with broad applicability to quantum simulators and beyond.

Abstract

Interpretable machine learning is rapidly becoming a crucial tool for scientific discovery. Among existing approaches, variational autoencoders (VAEs) have shown promise in extracting the hidden physical features of input data without supervision or prior knowledge of the system under study. Yet, the ability of VAEs to create meaningful, interpretable representations relies on their accurate approximation of the underlying probability distribution of their input. When dealing with quantum data, VAEs must hence account for its intrinsic randomness and complex correlations. While VAEs have been previously applied to quantum data, they have often neglected its probabilistic nature, hindering the extraction of meaningful physical descriptors. Here, we demonstrate that two key modifications enable VAEs to learn physically meaningful latent representations: a decoder capable of faithfully reproducing quantum states and a probabilistic loss tailored to this task. Using benchmark quantum spin models, we identify regimes where standard methods fail while the representations learned by our approach remain meaningful and interpretable. Applied to experimental data from Rydberg atom arrays, the model autonomously uncovers the phase structure without access to prior labels, Hamiltonian details, or knowledge of relevant order parameters, highlighting its potential as an unsupervised and interpretable tool for the study of quantum systems.

Paper Structure

This paper contains 22 sections, 19 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Schematic representation of the proposed pipeline. a) An experimental setup produces snapshots of a quantum system for different values of the experimental parameters $\theta_{1,2}$. b) A variational autoencoder (VAE) is trained on an ensemble of unlabeled snapshots collected from the previous experiment across a wide parameter range. Inspecting the learned latent neurons $\mathbf{z}$ across the parameter space sheds light on the system's phase space (see \ref{sec:results} for more). c) Schematic representation of the cpVAE. The encoder neural network, ENC, takes a spin configuration as input and outputs the means $\mu_i$ and the variances $\sigma_i$ parameterizing the Gaussian distribution from which the latent variables $z_i$ are sampled. Then, the decoder neural network, DEC, takes the latent variables as input and outputs the conditional probabilities of each spin being in a given state (up or down in the present scheme). During training, the decoder also receives as input the spin configuration to be reconstructed by the conditional probabilities, as indicated by the dotted arrow. Once trained, the latent neurons $z_i$ encode the main physical features of the system, which directly relate to its phase space. Moreover, one can fix the value of the latent variables and generate new configurations by autoregressively sampling the output conditional probabilities.
  • Figure 2: Next-nearest-neighbor transverse-field Ising model. a) Next-nearest-neighbor correlator (left) and magnetization (right) computed from training set configurations across the phase space. (b,c) Next-nearest-neighbor correlator of the spin configurations generated across the phase space by the dVAE and cpVAE, respectively.
  • Figure 3: LR-TFIM. a) Magnetization (top) and $\beta$ exponent from \ref{eq:corr_exp} (bottom) computed from training set configurations across the phase space. (b,c) Absolute value of the learned latent space mean values $\mu_i$ for the two active neurons, for the dVAE and cpVAE, respectively, and for input configurations across the phase space. d) Structure factor from \ref{eq:struc_factor} with $k=0.8$ and $i=0$ for training set configurations across the phase space. (e,f) $\beta$ exponent for spin configurations generated by the dVAE and the cpVAE, respectively. The former yields $\beta=0$ across the whole phase space.
  • Figure 4: Experimental Rydberg atom array. a) Different Fourier-space order parameters (\ref{eq:fourier}) highlighting the location of the checkerboard, star and striated phases across the phase space. The boundary-ordered phase is located by means of the difference between the nearest-neighbor correlator for atoms at the edge and in the bulk of the array. The insets showcase exemplary configurations of each phase on a smaller $9\times 9$ lattice, with white, purple and orange circles representing atoms in the $\ket{g}$, $\ket{r}$ and $\ket{+_x}$ states, respectively. b) Latent representation learned by the cpVAE. For the two remaining active neurons, we present the mean values (left) and variances (right) as output by the encoder, for input configurations across the phase space. We also show the absolute mean values (center), which showcase a finer resolution of the learned latent space.
  • Figure 5: Training: NNN-TFIM. Training of a) the dVAE and b) the cpVAE seen through the reconstruction loss $\mathcal{L}_{\mathrm{reconstr.}}$ and the $\log$ of the variances $\sigma_i$ output by the encoder. For the dVAE the reconstruction loss is the mean square error (MSE) and for the cpVAE it is the first term in \ref{eq:final_loss}.
  • ...and 5 more figures
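
The Figure 1 caption describes a decoder that outputs, for each spin, the conditional probability of being up or down given the latent variables and the preceding spins, and that generates new configurations by sampling these conditionals one spin at a time. The following is a minimal sketch of that idea only, not the authors' implementation: the logistic model in `conditional_up_prob`, the parameter layout in `random_params`, and all function names are hypothetical stand-ins for the decoder neural network.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conditional_up_prob(i, z, spins, params):
    """Toy stand-in for the decoder (assumption): p(s_i = up | s_<i, z)."""
    logit = params["bias"][i]
    logit += sum(w * zk for w, zk in zip(params["latent"][i], z))
    # Only earlier spins (j < i) enter, which makes the model autoregressive.
    logit += sum(params["coupling"][i][j] * (2 * spins[j] - 1) for j in range(i))
    # Clamp so that the logarithm below is always finite.
    return min(max(sigmoid(logit), 1e-12), 1.0 - 1e-12)


def sample_configuration(z, params, rng):
    """Generate one spin configuration (1 = up, 0 = down) for fixed latent z."""
    spins = []
    for i in range(len(params["bias"])):
        p_up = conditional_up_prob(i, z, spins, params)
        spins.append(1 if rng.random() < p_up else 0)
    return spins


def log_likelihood(spins, z, params):
    """log p(s | z) as the sum of the log conditional probabilities."""
    total = 0.0
    for i, s in enumerate(spins):
        p_up = conditional_up_prob(i, z, spins, params)
        total += math.log(p_up if s == 1 else 1.0 - p_up)
    return total


def random_params(n_spins, n_latent, rng):
    """Random toy parameters; a trained network would take their place."""
    return {
        "bias": [rng.gauss(0, 1) for _ in range(n_spins)],
        "latent": [[rng.gauss(0, 1) for _ in range(n_latent)] for _ in range(n_spins)],
        "coupling": [[rng.gauss(0, 1) for _ in range(n_spins)] for _ in range(n_spins)],
    }


rng = random.Random(0)
params = random_params(n_spins=4, n_latent=2, rng=rng)
z = [0.5, -1.0]  # fixed latent values, chosen arbitrarily for illustration
sample = sample_configuration(z, params, rng)
print(sample, log_likelihood(sample, z, params))
```

Because each factor is a properly normalized Bernoulli probability, the product of conditionals defines a normalized distribution over all spin configurations for any fixed `z`, so the negative of `log_likelihood` can serve as a probabilistic reconstruction term. This is in contrast to a mean-square-error loss, which compares a single reconstructed sample with the input.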