Shortcut learning in geometric knot classification

Djordje Mihajlovic; Davide Michieletto

TL;DR

This work interrogates whether ML knot-classification models truly learn ambient isotopy invariants or rely on geometric shortcuts. It introduces a mutual-information–based shortcut probe using geometric functionals $\{\phi_j\}$ (e.g., $\Sigma_{+}, \Omega_{+}, \kappa_{+}, M, \Pi_n$) and compares models trained on MD-generated data with those trained on geometrically unbiased GEOKNOT data, highlighting the role of data sampling. The authors show that MD datasets exhibit strong geometry–topology correlations driving shortcut learning, while GEOKNOT data do not, and they reveal that writhe-matrix representations can encode low-order topological information (e.g., a second Vassiliev invariant) that standard ML fails to recover. This motivates diagnostic tools and improved sampling strategies for true topology-aware learning in geometric knot problems, with public code and data to foster future development.
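The idea of a mutual-information shortcut probe can be illustrated with a minimal sketch. It assumes a single scalar geometric functional $\phi_j$ per embedding and a binary topology label, and uses a simple equal-width histogram (plug-in) estimator. The estimator choice, the function name `mutual_information`, and the synthetic "biased" and "unbiased" features are illustrative assumptions, not the authors' implementation or data.

```python
import math
import random
from collections import Counter

def mutual_information(feature, labels, n_bins=10):
    """Plug-in estimate of I(feature; label) in nats, using equal-width bins.

    Assumption: a histogram estimator; the paper's estimator may differ.
    """
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    bins = [min(int((x - lo) / width), n_bins - 1) for x in feature]
    n = len(feature)
    joint = Counter(zip(bins, labels))   # counts of (bin, label) pairs
    count_bin = Counter(bins)            # marginal counts per bin
    count_lab = Counter(labels)          # marginal counts per label
    mi = 0.0
    for (b, y), c in joint.items():
        # p(b, y) * log( p(b, y) / (p(b) p(y)) ), written in terms of counts
        mi += (c / n) * math.log(c * n / (count_bin[b] * count_lab[y]))
    return mi

rng = random.Random(0)
labels = [i % 2 for i in range(2000)]  # balanced synthetic labels

# Hypothetical feature whose distribution depends on the label (a shortcut).
biased = [rng.gauss(1.0 + 2.0 * y, 0.1) for y in labels]
# Hypothetical feature drawn identically for both labels (no shortcut).
unbiased = [rng.gauss(2.0, 0.5) for _ in labels]

mi_biased = mutual_information(biased, labels)
mi_unbiased = mutual_information(unbiased, labels)
print(f"biased feature:   {mi_biased:.3f} nats")
print(f"unbiased feature: {mi_unbiased:.3f} nats")
```

For balanced binary labels the estimate is bounded above by $\ln 2 \approx 0.693$ nats, so a geometric feature scoring near that bound is by itself sufficient to predict the label, which is the signature of a shortcut.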

Abstract

Classifying the topology of closed curves is a central problem in low-dimensional topology, with applications beyond mathematics spanning protein folding, polymer physics and even magnetohydrodynamics. The central problem is how to determine whether two embeddings of a closed arc are equivalent under ambient isotopy. Given the striking ability of neural networks to solve complex classification tasks, it is natural to ask whether the knot classification problem can be tackled using Machine Learning (ML). In this paper, we investigate generic shortcut methods employed by ML to solve the knot classification challenge. Specifically, we discover hidden non-topological features in training data generated through Molecular Dynamics simulations of polygonal knots, which ML uses to arrive at positive classification results. We then provide a rigorous foundation for future attempts to tackle the knot classification challenge using ML by developing a publicly available (i) dataset that aims to remove the potential for classification based on non-topological features and (ii) code that can generate knot embeddings that faithfully explore a chosen geometric state space with fixed knot topology. We expect that our work will accelerate the development of ML models that can solve complex geometric knot classification challenges.

Paper Structure (13 sections, 17 equations, 6 figures, 4 tables, 1 algorithm)

Introduction
Methods
Results
Discussion and Conclusions
Data and code availability
Acknowledgements
BFACF and pivot based knot sampling
Geometric features in the discretized, polygonal setting
Pairwise distance
Long-range entanglement
Total writhe
Average crossing number
Vassiliev invariant Feynman rules

Figures (6)

Figure 1: Shortcut learning of knot topology. A.) The unknot ($0_1$) and the Conway knot ($11_{34}$) have the same (trivial) Alexander polynomial. However, one cannot be deformed into the other. B.) A sketch of a neural network taking knot embedding coordinates as input and outputting a binary unknot versus trefoil ($0_1$ vs. $3_1$) classification. C.) An example of a configurational landscape of a knot embedding. The two main directions here are "size" and "writhe", i.e. the amount of self-crossing of the curve. Molecular Dynamics simulations intrinsically bias sampling towards low free energy embeddings and therefore generate knot conformations with narrow distributions of geometric properties. D.) Example of a training dataset in which the ML takes an obvious shortcut based on the size of embeddings. All the trivial knots ($0_1$) are smaller than the trefoils ($3_1$) in the dataset. When the ML is challenged with $0_1$ that are large and $3_1$ that are small, it classifies them incorrectly. Figure layout inspired by geirhos2020shortcut.
Figure 2: ML models fail to classify GEOKNOT dataset. A Examples of knot embeddings from MD simulations at low and high temperatures and of GEOKNOT. All three embeddings have 100 nodes. Plotted using KnotPlot scharein2002interactive. B Confusion matrices for models trained on $\text{MD}^{\text{Low T}}$, $\text{MD}^{\text{High T}}$, and GEOKNOT respectively with coordinate data as the feature input. C Confusion matrices for models trained on $\text{MD}^{\text{Low T}}$, $\text{MD}^{\text{High T}}$, and GEOKNOT respectively with writhe matrix data as the feature input.
Figure 3: MD datasets display limited sampling of geometries. Distributions of A total writhe, B average crossing number and C long-range entanglement for unknots ($0_1$) and trefoil ($3_1$) knots. From left to right we show distributions from the MD low-temperature, MD high-temperature and GEOKNOT datasets. The definitions of these quantities are in Appendix B.
Figure 4: Saliency analysis identifies key geometric probes for correct classification. Saliency analysis on ML models trained on geometric probes using both low (A) and high (B) temperature MD (LAMMPS) datasets. At low temperature the $\Omega_+$ feature is by far the most important for correct classification, suggesting that total space writhe is a dominant shortcut feature.
Figure 5: ML model prediction is not invariant under ambient isotopy. In A and B we show two examples of $0_1$ embeddings generated by GEOKNOT that are misclassified as $3_1$ by ML models trained on MD data. The embeddings are minimised by KnotPlot until all non-topological self-crossings are removed. In A the model is trained on the XYZ coordinates of MD data. In B the model is trained on the writhe matrix of MD data. In both panels, a smoothed rendering of the knot embedding is shown for visual clarity.
...and 1 more figure
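The size-based shortcut sketched in panel D of Figure 1 can be reproduced with a toy example. The sketch below is an illustration under strong assumptions: `toy_embedding` is a hypothetical stand-in that draws a noisy planar ring rather than a real knot, the labels are assigned by hand, and the "model" is a one-parameter threshold on the radius of gyration instead of a neural network. None of this is the authors' pipeline or data.

```python
import math
import random

def radius_of_gyration(coords):
    """Root-mean-square distance of the nodes from their centroid."""
    n = len(coords)
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]
    sq = sum((p[k] - centroid[k]) ** 2 for p in coords for k in range(3))
    return math.sqrt(sq / n)

def toy_embedding(scale, rng, n_nodes=100):
    """Hypothetical stand-in: a noisy planar ring of radius `scale`.

    This is NOT a knot sampler; topology labels are assigned by hand below.
    """
    pts = []
    for i in range(n_nodes):
        t = 2.0 * math.pi * i / n_nodes
        pts.append((scale * math.cos(t) + rng.gauss(0, 0.05),
                    scale * math.sin(t) + rng.gauss(0, 0.05),
                    rng.gauss(0, 0.05)))
    return pts

def fit_threshold(sizes, labels):
    """Midpoint between class-mean sizes; the rule predicts label 1 above it."""
    s0 = [s for s, y in zip(sizes, labels) if y == 0]
    s1 = [s for s, y in zip(sizes, labels) if y == 1]
    return 0.5 * (sum(s0) / len(s0) + sum(s1) / len(s1))

def accuracy(sizes, labels, threshold):
    hits = sum((s > threshold) == (y == 1) for s, y in zip(sizes, labels))
    return hits / len(labels)

rng = random.Random(0)
labels = [i % 2 for i in range(100)]  # 0 plays the role of 0_1, 1 of 3_1

# Training set: every "0_1" is small and every "3_1" is large.
train_sizes = [radius_of_gyration(toy_embedding(1.0 + y, rng)) for y in labels]
# Challenge set: the size/label correlation is reversed.
flipped_sizes = [radius_of_gyration(toy_embedding(2.0 - y, rng)) for y in labels]

threshold = fit_threshold(train_sizes, labels)
train_acc = accuracy(train_sizes, labels, threshold)
flipped_acc = accuracy(flipped_sizes, labels, threshold)
print(f"train accuracy:   {train_acc:.2f}")
print(f"flipped accuracy: {flipped_acc:.2f}")
```

The rule is perfect on the training distribution and fails on every example once the size/label correlation is reversed, because it has learned size rather than anything about topology.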
