Learning Chern Numbers of Topological Insulators with Gauge Equivariant Neural Networks

Longde Huang, Oleksandr Balabanov, Hampus Linander, Mats Granath, Daniel Persson, Jan E. Gerken

TL;DR

This work develops gauge-equivariant networks that respect local $U(N)$ gauge symmetry to predict discretized Chern numbers $\tilde{C}$ in multiband topological insulators. By combining new gauge-equivariant layers (GEBL, GEAct, GEConv) with a stabilization layer (TrNorm), the approach learns robustly where non-equivariant models fail, and the authors prove a universal approximation theorem for class functions on $G=\mathrm{U}(N)$. The authors demonstrate strong generalization from trivial to nontrivial topology and extend the results to higher dimensions (4D Chern numbers), with detailed ablations and data-generation strategies based on Wilson loops. The framework provides a scalable, locality-preserving method for extracting topological invariants from discretized band structures, offering a practical tool for exploring topological phases in complex, multi-band systems.

Abstract

Equivariant network architectures are a well-established tool for predicting invariant or equivariant quantities. However, almost all learning problems considered in this context feature a global symmetry, i.e. each point of the underlying space is transformed with the same group element, as opposed to a local "gauge" symmetry, where each point is transformed with a different group element, exponentially enlarging the size of the symmetry group. Gauge equivariant networks have so far mainly been applied to problems in quantum chromodynamics. Here, we introduce a novel application domain for gauge-equivariant networks in the theory of topological condensed matter physics. We use gauge equivariant networks to predict topological invariants (Chern numbers) of multiband topological insulators. The gauge symmetry of the network guarantees that the predicted quantity is a topological invariant. We introduce a novel gauge equivariant normalization layer to stabilize the training and prove a universal approximation theorem for our setup. We train on samples with trivial Chern number only but show that our models generalize to samples with non-trivial Chern number. We provide various ablations of our setup. Our code is available at https://github.com/sitronsea/GENet/tree/main.
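To make the role of local gauge symmetry concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a commonly used lattice definition of the Chern number: the sum over plaquettes of the principal argument of the determinant of the plaquette Wilson loop, divided by $2\pi$. The link variables here are random $U(N)$ matrices on a periodic grid rather than links derived from a band structure, and all function names (`random_field`, `plaquettes`, `chern_number`, `gauge_transform`) are hypothetical. The paper's exact conventions may differ.

```python
import numpy as np


def dagger(a):
    """Conjugate transpose over the last two axes."""
    return np.conj(np.swapaxes(a, -1, -2))


def random_unitary(n, rng):
    """Random n x n unitary matrix from the QR decomposition of a complex Gaussian."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))  # rescale columns by unit phases; q stays unitary


def random_field(shape, n, rng):
    """Independent random U(n) matrices with leading dimensions `shape` (a tuple)."""
    out = np.empty(shape + (n, n), dtype=complex)
    for idx in np.ndindex(*shape):
        out[idx] = random_unitary(n, rng)
    return out


def plaquettes(links):
    """Plaquette Wilson loops W(x, y) on a periodic 2D grid.

    links[mu, x, y] is the link leaving site (x, y) in direction mu (0 = x, 1 = y).
    """
    ux, uy = links[0], links[1]
    uy_right = np.roll(uy, -1, axis=0)  # U_y at site (x + 1, y)
    ux_up = np.roll(ux, -1, axis=1)     # U_x at site (x, y + 1)
    return ux @ uy_right @ dagger(ux_up) @ dagger(uy)


def chern_number(links):
    """Sum of plaquette fluxes arg det W, divided by 2 pi."""
    flux = np.angle(np.linalg.det(plaquettes(links)))
    return flux.sum() / (2 * np.pi)


def gauge_transform(links, omega):
    """Local gauge transformation: U_mu(x) -> Omega(x) U_mu(x) Omega(x + mu)^dagger."""
    out = np.empty_like(links)
    out[0] = omega @ links[0] @ dagger(np.roll(omega, -1, axis=0))
    out[1] = omega @ links[1] @ dagger(np.roll(omega, -1, axis=1))
    return out


rng = np.random.default_rng(0)
grid, bands = 5, 4  # illustrative sizes only
links = random_field((2, grid, grid), bands, rng)
omega = random_field((grid, grid), bands, rng)

print(chern_number(links))
print(chern_number(gauge_transform(links, omega)))
```

Under the transformation each plaquette becomes $\Omega(x) W(x) \Omega(x)^\dagger$, so its determinant, and hence the flux sum, is unchanged. On a periodic grid the determinants of all plaquettes multiply to one, which is why the sum of fluxes is an integer multiple of $2\pi$ up to floating-point error.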

Paper Structure

This paper contains 40 sections, 8 theorems, 49 equations, 11 figures, 4 tables.

Key Result

Theorem 1

For a compact Lie group $G$, and with the nonlinearity $\sigma$ in GEAct taking the form $\tilde{\sigma}\circ \mathrm{Re}$, where $\tilde{\sigma}$ is bounded and non-decreasing, GEBLNet can approximate any class function on $G$.
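A class function on $G$ is one that is invariant under conjugation, $f(\Omega W \Omega^{-1}) = f(W)$ for all $\Omega, W \in G$. As a small illustration for $G = \mathrm{U}(N)$, the sketch below checks numerically that the real parts of traces of matrix powers are class functions, while a single matrix entry generically is not. This is an illustrative example only, not part of the GEBLNet architecture, and the helper names `random_unitary` and `trace_powers` are hypothetical.

```python
import numpy as np


def random_unitary(n, rng):
    """Random n x n unitary matrix from the QR decomposition of a complex Gaussian."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))  # rescale columns by unit phases; q stays unitary


def trace_powers(w, max_power):
    """Return Re Tr(W^k) for k = 1, ..., max_power."""
    feats = []
    p = np.eye(w.shape[0], dtype=complex)
    for _ in range(max_power):
        p = p @ w
        feats.append(np.trace(p).real)
    return np.array(feats)


rng = np.random.default_rng(0)
w = random_unitary(3, rng)
omega = random_unitary(3, rng)
w_conj = omega @ w @ omega.conj().T  # conjugation by a group element

# Traces of powers agree up to floating-point error (class functions).
print(np.allclose(trace_powers(w, 4), trace_powers(w_conj, 4)))

# A single matrix entry is generically changed by conjugation.
print(w[0, 0], w_conj[0, 0])
```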

Figures (11)

  • Figure 1: Best relative error of predicted matrix determinants for polynomial architectures as a function of matrix size. The dashed line indicates the relative error of a mean predictor. The architectures considered include layers of order $\leq 4$ and depth $\leq 4$, containing terms of up to order $16$ by composition.
  • Figure 2: Architecture of GEBLNet. The rectangles represent the spatial grid, and the number of layers ($N_\text{ch}$) represents the number of channels ($\gamma$). Each circle represents a site on the grid; quantities on different sites do not interact with each other until the final summation over the grid.
  • Figure 3: Comparison of statistics of $\|\text{Tr} W'^\gamma_k\|$ across each layer for two training runs on a $5\times 5$ grid with $4$ filled bands. The two runs have identical configurations, except that only one of them uses TrNorm layers.
  • Figure 4: Comparison of global and standard deviation loss on validation data between the same two runs as in Figure 3, both trained only on samples with zero Chern number. The run without TrNorm layers collapses to zero local quantities everywhere and therefore has a lower $L_{\mathrm{g}}$ on trivial samples than the run with TrNorm layers. Nevertheless, it does not generalize to nontrivial cases. In contrast, the run with TrNorm layers succeeds in learning global quantities and local differences simultaneously.
  • Figure 5: Comparison of rescaled local outputs with local true values. Points closer to the reference line $y = x$ indicate higher accuracy in capturing local quantities.
  • ...and 6 more figures

Theorems & Definitions (14)

  • Definition 1
  • Theorem 1: Universal Approximation Theorem
  • Theorem 2
  • proof
  • Corollary 3
  • proof: Sketch of Proof
  • Proposition 4
  • proof
  • Proposition 5: Vec Operator Identity
  • Proposition 6
  • ...and 4 more