Interpretable Artificial Intelligence (AI) Analysis of Strongly Correlated Electrons

Changkai Zhang, Jan von Delft

TL;DR

The study introduces transformer-inspired AI workflows for analyzing snapshots from tensor-network simulations of the 2D Hubbard model, targeting strongly correlated electron phenomena. It compares a core semi-linear attention architecture with an encoder-like pro architecture across a 9-category temperature–doping dataset, achieving strong classification performance and enabling interpretable dynamics through a Markov-process view of attention. A confusion-analysis framework reveals robust, category-specific correlation patterns, and a universal omnimeter leverages classifier posteriors to infer multiple observables from ensembles, including a 25-category extension that improves thermometry in ultracold-atom simulations. The approach demonstrates a principled, end-to-end pipeline—from data generation to interpretable inference—that can extend to other lattice models and experimental contexts, with potential impact on quantum many-body analysis and quantum simulation thermometry.

Abstract

Artificial Intelligence (AI) has become an exceptionally powerful tool for analyzing scientific data. In particular, attention-based architectures have demonstrated a remarkable capability to capture complex correlations and to furnish interpretable insights into latent, otherwise inconspicuous patterns. This progress motivates the application of AI techniques to the analysis of strongly correlated electrons, which remain notoriously challenging to study using conventional theoretical approaches. Here, we propose novel AI workflows for analyzing snapshot datasets from tensor-network simulations of the two-dimensional (2D) Hubbard model over a broad range of temperature and doping. The 2D Hubbard model is an archetypal strongly correlated system, hosting diverse intriguing phenomena including Mott insulators, anomalous metals, and high-$T_c$ superconductivity. Our AI techniques yield fresh perspectives on the intricate quantum correlations underpinning these phenomena and facilitate universal omnimetry for ultracold-atom simulations of the corresponding strongly correlated systems.

Paper Structure

This paper contains 19 sections, 39 equations, 23 figures, 4 tables.

Figures (23)

  • Figure 1: A schematic depiction of the locations in phase space for the nine categories (Cat), created by combining three choices of temperature (high, medium, and low) with three doping regimes (over-doped, medium-doped, and under-doped). The red and blue freehand-shaded areas mark the AFM Mott insulating phase and the high-$T_c$ superconducting phase, respectively, as expected for the Hubbard model. The charge doping varies with temperature (see also the omnimetry figure); precise values are provided in the supplemental material.
  • Figure 2: Schematic illustrations of the core (left) and the pro (right) architectures for classification of sequential inputs. Both architectures comprise input codecs, multi-head attention blocks, feed-forward networks, and a final linear classification head. The pro architecture is an analog of the encoder-only transformer, while the core architecture leaves out the feed-forward networks between attention blocks, which enhances parallelism and improves interpretability.
  • Figure 3: Training profiles and benchmarks of the production and baseline models following the core and pro architectures. Metrics are displayed as raw data (thin lines with muted color) and with an exponential smoothing factor $\alpha\!=\!0.4$ (thick lines with deep color). For baseline models (with trivialized attention), the pro architecture consistently outperforms the core variant across all metrics, consistent with the anticipated benefits of the stronger non-linearity in the pro model. By contrast, for production models (with fully functional attention), the core architecture trails the pro architecture by only negligible gaps in performance, indicating that the semi-linear attention aligns with the intrinsic properties of the dataset.
  • Figure 4: (a) Sensitivity matrix (row-normalized confusion matrix) and (b) precision matrix (column-normalized confusion matrix) for the core model. Color intensity indicates the degree of sensitivity (a) and precision (b), with exact values written within each cell. Top rows indicate the charge doping and temperature of the corresponding categories. Diagonal entries show (a) the probability of correct classification for each category and (b) the probability of a predicted category being correct.
  • Figure 5: Error in the omnimeter estimation of (a) the thermal exponent and (b) charge doping. Open circles/pentagrams mark the locations in phase space of the snapshot ensembles under evaluation, with pentagrams (circles) indicating data included (not included) in the training set. Color scales are obtained via interpolation. Overall performance is good, except for the bands at the unseen doping levels (around 7% and 17% in (a) and around 17% in (b)).
  • ...and 18 more figures
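
The caption of Figure 4 defines the sensitivity matrix as the row-normalized confusion matrix and the precision matrix as the column-normalized one. The following minimal sketch illustrates those two normalizations on a toy example. The function names, the three-category labels, and the convention that rows index the true category and columns the predicted category are assumptions made here for illustration; they are not taken from the paper.

```python
def confusion_matrix(true_labels, predicted_labels, num_categories):
    """Count matrix: counts[i][j] = samples of true category i predicted as j.

    Assumed convention (not from the paper): rows = true, columns = predicted.
    """
    counts = [[0] * num_categories for _ in range(num_categories)]
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    return counts


def sensitivity_matrix(counts):
    """Row-normalize, so each row (one true category) sums to 1."""
    result = []
    for row in counts:
        total = sum(row)
        result.append([c / total if total else 0.0 for c in row])
    return result


def precision_matrix(counts):
    """Column-normalize, so each column (one predicted category) sums to 1."""
    n = len(counts)
    col_totals = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    return [
        [counts[i][j] / col_totals[j] if col_totals[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Toy data with three hypothetical categories (labels 0, 1, 2).
true_labels = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
predicted_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0]

counts = confusion_matrix(true_labels, predicted_labels, 3)
sens = sensitivity_matrix(counts)
prec = precision_matrix(counts)

for name, matrix in (("sensitivity", sens), ("precision", prec)):
    print(name)
    for row in matrix:
        print("  " + "  ".join(f"{value:.2f}" for value in row))
```

Under this convention, the diagonal of `sens` gives the probability that a sample of a given category is classified correctly, and the diagonal of `prec` gives the probability that a predicted category is correct, matching the reading of the diagonal entries described in the caption.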