Why is topology hard to learn?

D. O. Oriekhov, Stan Bergkamp, Guliuxin Jin, Juan Daniel Torres Luna, Badr Zouggari, Sibren van der Meer, Naoual El Yazidi, Eliska Greplova

TL;DR

The paper investigates why topology is hard to learn by constructing a hybrid tensor-neural network that exactly expresses the real-space winding number (RSWN) for SSH-like models (AIII class) and benchmarking it against physics-agnostic networks. It finds that while the exact-topology network can encode the invariant with high precision, it suffers from reduced trainability and sensitivity to initialization, whereas simpler, symmetry-informed or fully generic networks can achieve strong performance but may rely on dataset proxies rather than the invariant itself. Through weight analysis and SVD pruning, the authors show that topology classification often hinges on a compact set of edge-state features rather than the full global invariant, and that disorder erodes simple correlation-based learning. This work clarifies interpretable ML strategies for condensed-matter topology and suggests architectures that better generalize under real-world imperfections.
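The SVD pruning mentioned above can be illustrated with a generic low-rank truncation of a weight matrix. This is a minimal sketch and not the authors' procedure: the function name `svd_prune`, the choice to keep the largest `rank` singular values, and the toy matrix are assumptions introduced here for illustration.

```python
import numpy as np


def svd_prune(weights, rank):
    """Return the best rank-`rank` approximation of `weights` (hypothetical helper).

    Keeps the `rank` largest singular values and discards the rest.
    """
    u, s, vh = np.linalg.svd(weights, full_matrices=False)
    # Scale the kept left singular vectors by their singular values,
    # then recombine with the kept right singular vectors.
    return (u[:, :rank] * s[:rank]) @ vh[:rank]


# Toy example (assumed data): a 6x5 matrix that has exactly rank 2.
rng = np.random.default_rng(0)
toy_weights = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5))

pruned_rank2 = svd_prune(toy_weights, 2)  # nothing is lost at rank 2
pruned_rank1 = svd_prune(toy_weights, 1)  # one direction is discarded
```

If a classifier keeps its accuracy after such a truncation, the decision plausibly rests on the few retained directions, which is the kind of evidence the summary refers to when it speaks of a compact set of features.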

Abstract

Much attention has been devoted to the use of machine learning to approximate physical concepts. Yet, due to challenges in the interpretability of machine learning techniques, the question of what physics machine learning models are able to learn remains open. Here we bridge the concept of a physical quantity and its machine learning approximation in the context of the original application of neural networks in physics: topological phase classification. We construct a hybrid tensor-neural network object that exactly expresses a real-space topological invariant and rigorously assess its trainability and generalization. Specifically, we benchmark the accuracy and trainability of a tensor-neural network against multiple types of neural networks, thus exemplifying the differences in trainability and representational power. Our work highlights the challenges in learning topological invariants and constitutes a stepping stone towards more accurate and better generalizable machine learning representations in condensed matter physics.
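To make the real-space topological invariant concrete, here is a minimal NumPy sketch of a real-space winding number for a clean SSH chain with open boundaries. It is not the authors' tensor-network construction. The function names, the hopping convention (intracell `t1`, intercell `t2`), and the averaging over a bulk window of cells are assumptions made here for illustration. The sketch relies on a standard fact: for a chiral Hamiltonian with off-diagonal block $h$, the flattened block $q$ is the unitary factor of $h$, and the winding number per unit cell is the bulk-averaged diagonal of $q^\dagger [X, q]$.

```python
import numpy as np


def ssh_block(n_cells, t1, t2):
    """Off-diagonal (A-to-B) block of an open SSH chain (assumed convention).

    t1 couples A and B inside a cell; t2 couples A in cell n+1 to B in cell n.
    """
    return t1 * np.eye(n_cells) + t2 * np.eye(n_cells, k=-1)


def real_space_winding(h, margin):
    """Bulk-averaged real-space winding number for the chiral block `h`.

    `margin` cells are dropped at each end, because the trace over the
    whole open chain vanishes identically for a unitary q.
    """
    n = h.shape[0]
    u, _, vh = np.linalg.svd(h)
    q = u @ vh  # unitary factor of h (spectrally flattened block)
    x = np.arange(n)  # unit-cell position operator (diagonal)
    comm = x[:, None] * q - q * x[None, :]  # [X, q]
    # Diagonal of q^dagger [X, q]: sum over m of conj(q_mn) * comm_mn
    diag = np.einsum("mn,mn->n", q.conj(), comm)
    bulk = slice(margin, n - margin)
    return float(np.real(diag[bulk].sum()) / (n - 2 * margin))


nu_topological = real_space_winding(ssh_block(40, 0.5, 1.0), margin=10)
nu_trivial = real_space_winding(ssh_block(40, 1.0, 0.5), margin=10)
```

With these conventions the result is close to 1 for $|t_2| > |t_1|$ and close to 0 for $|t_2| < |t_1|$, up to corrections that decay exponentially with the margin. The overall sign depends on the hopping-direction convention chosen in `ssh_block`.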

Paper Structure

This paper contains 7 sections, 18 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: Neural network architectures: (a) topological neural network that combines a tensor layer and one neural layer and exactly reproduces the structure of the real-space winding number for SSH-like two-band models within the AIII symmetry class. (b) Preprocessed neural network: the tensor layer is retained, but the remainder of the network consists of a multi-layer classifier that no longer corresponds exactly to the topological invariant. (c) Feed-forward neural network: a fully connected classifier without the preprocessing layer and without a built-in correspondence to the topological invariant. The input size for networks (a) and (b) scales as $N^2$ for $N$ unit cells. The input size for network (c) is $4N^2$.
  • Figure 2: MSE loss for the RSWN regression: panel (a) shows the MSE loss on training data, and panel (b) on validation data. Three models were trained to learn the floating-point value of the RSWN. The topological model ('topo') is shown in blue; the preprocessed model ('pre-proc') is shown in red for SGD and orange for Adam; the FNN model is shown in dark green for SGD and light green for Adam.
  • Figure 3: Generalization of the topological, pre-processed and FNN models: each neural network is trained on one non-disordered ($d=0$) dataset corresponding to a model Hamiltonian of SSH (a, b) or ESSH (c, d), and tested on the other Hamiltonian dataset as well as on both disordered types (colors are indicated in the legend). Model types are indicated in the panel titles. We train a topological model (a, c) for different initialization variances and a pre-processed model (b, d) for five different seeds. The FNN model results are shown as dashed lines in each panel.
  • Figure 4: Evolution of the topological model weights during training: the orange dots are the initial values of each weight, the blue line is the training trajectory, and the dark blue dot is the final point. (a) The weights are initialized from a Gaussian distribution centered around $1.0$, which corresponds to the correct RSWN formula; (b) the Gaussian distribution for the weight initialization is shifted away from the correct RSWN formula and centered around $1.5$; (c) the Gaussian distribution for the weight initialization is centered around $0.5$.
  • Figure S1: Change in the architecture of a "topology-inspired" neural network when the properties of SSH-type modes at half-filling are taken into account. The property of the bands under chiral symmetry is used to reduce the tensor network layer from a 4-leg type to a 2-leg type. The corresponding input sizes and numbers of weights in the regression layer are (a) $N^4$ and (b) $N^2$.
  • ...and 6 more figures
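
The input sizes quoted in the captions of Figure 1 and Figure S1 can be checked with a short counting sketch. It assumes, for illustration only, that the generic network sees the full $2N \times 2N$ Hamiltonian and that the $N^2$-sized input corresponds to an $N \times N$ object such as the chiral off-diagonal block; the captions state only the scaling, and the helper `ssh_hamiltonian` and its hopping convention are hypothetical.

```python
import numpy as np


def ssh_hamiltonian(n_cells, t1, t2):
    """Open SSH chain in a sublattice-ordered basis (all A sites, then all B sites)."""
    h = t1 * np.eye(n_cells) + t2 * np.eye(n_cells, k=-1)
    zero = np.zeros_like(h)
    # Chiral form: only A-B couplings, so the diagonal blocks vanish.
    return np.block([[zero, h], [h.T, zero]])


n_cells = 8
ham = ssh_hamiltonian(n_cells, 0.5, 1.0)

# Full Hamiltonian flattened: (2N)^2 = 4 N^2 entries.
full_input = ham.reshape(-1)

# Off-diagonal block flattened: N^2 entries (assumed stand-in for the N^2 input).
block_input = ham[:n_cells, n_cells:].reshape(-1)

# A generic 4-leg tensor over unit-cell indices would carry N^4 entries.
four_leg_size = n_cells ** 4
```

For $N = 8$ this gives 256 entries for the full Hamiltonian, 64 for the block, and 4096 for a 4-leg tensor, which shows why the reduction from $N^4$ to $N^2$ matters for the number of regression weights.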