Persistent Homology for Structural Characterization in Disordered Systems

An Wang, Li Zou

TL;DR

This work develops a unified persistent homology framework to bridge local particle environments and global material structure in disordered systems. By converting point-cloud representations into PH descriptors and persistence images, it enables both interpretable ML (via SVM) and non-ML analyses, including the novel Separation Index. The approach yields near-perfect three-phase classification with a linear SVM using Global Softness and demonstrates that a single PH-based variable can approximate global phase structure, while Shapley analyses reveal the dominant role of H_1 and H_2 topological features. The framework provides mechanistic insight into how local topology evolves into long-range order, with broad implications for materials science and complex systems analyses.

Abstract

We propose a unified framework based on persistent homology (PH) to characterize both local and global structures in disordered systems. It generates local and global descriptors simultaneously using the same algorithm and data structure, and is shown to be highly effective and interpretable in predicting particle rearrangements and classifying global phases. We also demonstrate that a single variable is enough for a linear SVM to achieve nearly perfect three-phase classification. Inspired by this discovery, we define a non-parametric metric, the Separation Index (SI), which achieves this classification without a significant loss of performance and also establishes a connection between particle environments and the global phase structure. Our methods provide an effective framework for understanding and analyzing the properties of disordered materials, with broad potential applications in materials science and in wider studies of complex systems.

Paper Structure

This paper contains 40 sections, 16 equations, 19 figures, 7 tables.

Figures (19)

  • Figure 1: This figure is adapted from Ref. ghrist2008barcodes. It demonstrates barcodes, a faithful representation of the persistent homology (PH) results, showing how topological features emerge and persist as the parameter $\epsilon$ increases. The lines in the barcodes are categorized by their homology group ($H_0$, $H_1$, $H_2$), with each line representing a homology class. The left endpoint marks the feature’s birth at $\epsilon_i$, while the right endpoint indicates its death at $\epsilon_j$, with the length representing its persistence. Each line corresponds to either a birth-death pair $(\epsilon_i,\epsilon_j)$ or a birth-persistence pair $(\epsilon_i,\epsilon_j - \epsilon_i)$, where the persistence is the lifespan of the feature. The number of lines intersecting a vertical line at any $\epsilon$ represents the number of $H_k$ topological features at that scale, corresponding to the Betti number $\beta_k$. This paper focuses on $k=0,1,2$.
  • Figure 2: This figure shows the relationship between barcodes, persistence diagrams (PD), and persistence images (PI): a) shows barcodes, b) is the PD, and c) is the PI. PD maps barcode points to a 2D Cartesian system, and PI smooths these points using kernel-density estimation (KDE, see Eq. \ref{eq:mapping}), compressing varying PDs into fixed-size images for machine learning (ML) tasks.
  • Figure 3: A schematic diagram of a linear SVM, illustrating the decision hyperplane, support vectors, support vector hyperplanes, positive and negative samples, and the directed distance from a certain sample to the decision hyperplane. For convenience, this figure only illustrates the idealized two-dimensional case without loss of generality.
  • Figure 4: a) is the persistence diagram (PD) for a crystal ($\text{Traj}(T^{(1)}_5)$ at time step $t=900$). Its $H_0$ homology classes are distributed along a line near the y-axis, while the $H_1$ and $H_2$ classes are located within the yellow circle, which is magnified in b). The yellow solid line in b) divides the point sets of $H_1$ and $H_2$ into two disjoint parts. Inspired by this, the Separation Index (SI) is defined to measure how clearly the point sets of $H_1$ and $H_2$ are separated.
  • Figure 5: Box plots of the Separation Index (SI) of a) all neighborhoods with multiple radii of particle samples at time step $t=100+100 \times (k-1), k \in [1,10] \cap \mathbb{N}$, and b) all systems sampled at time step $t=100+25 \times (k-1), k \in [1,33] \cap \mathbb{N}$, from each trajectory in Groups 1 and 2.
  • ...and 14 more figures
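
The quantities described in the captions of Figures 1 to 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the half-open bar convention, the Gaussian kernel with persistence weighting, the grid size, and the value of `sigma` are all assumptions made for illustration; the paper's own mapping (its Eq. eq:mapping) is not reproduced here and may differ. The bars in the usage section are made-up numbers, not data from the paper.

```python
import math


def betti_number(bars, eps):
    """Count the bars alive at scale eps (lines crossing the vertical line at eps).

    Assumption: bars are half-open intervals [birth, death).
    """
    return sum(1 for birth, death in bars if birth <= eps < death)


def to_birth_persistence(bars):
    """Convert (birth, death) pairs into (birth, persistence) pairs."""
    return [(birth, death - birth) for birth, death in bars]


def persistence_image(bars, grid_size=4, extent=(0.0, 1.0, 0.0, 1.0), sigma=0.1):
    """Smooth a diagram onto a fixed grid_size x grid_size grid.

    Assumptions: birth-persistence coordinates, an unnormalized Gaussian
    kernel of width sigma, and a weight equal to the persistence of each bar.
    Rows index persistence (y), columns index birth (x), sampled at pixel centres.
    """
    x_min, x_max, y_min, y_max = extent
    dx = (x_max - x_min) / grid_size
    dy = (y_max - y_min) / grid_size
    image = [[0.0] * grid_size for _ in range(grid_size)]
    for birth, pers in to_birth_persistence(bars):
        for row in range(grid_size):
            y = y_min + (row + 0.5) * dy
            for col in range(grid_size):
                x = x_min + (col + 0.5) * dx
                sq_dist = (x - birth) ** 2 + (y - pers) ** 2
                image[row][col] += pers * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return image


def signed_distance(w, b, x):
    """Directed distance from sample x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


if __name__ == "__main__":
    # Hypothetical H_1 bars, chosen only to exercise the functions.
    h1_bars = [(0.2, 0.5), (0.3, 0.9), (0.4, 0.45)]
    print(betti_number(h1_bars, 0.42))   # all three bars are alive at 0.42
    print(betti_number(h1_bars, 0.6))    # only (0.3, 0.9) is alive at 0.6
    for row in persistence_image(h1_bars):
        print([round(v, 3) for v in row])
    print(signed_distance((3.0, 4.0), -5.0, (3.0, 4.0)))
```

Because the image has a fixed size regardless of how many bars the diagram contains, diagrams of different sizes map to feature vectors of equal length, which is the property the Figure 2 caption highlights for ML tasks.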