Clustering Cluster Algebras with Clusters

Man-Wai Cheung, Pierre-Philippe Dechant, Yang-Hui He, Elli Heyes, Edward Hirst, Jian-Rong Li

TL;DR

This work addresses the problem of classifying cluster variables in Grassmannian cluster algebras by encoding them as semistandard Young tableaux (SSYT) and generating large datasets via tableau mutations on HPC resources. It combines rigorous algebraic structure with machine learning: supervised classifiers and unsupervised PCA/K-Means both reveal meaningful separations by rank, $(k,n)$, and tableau structure, and these observations support conjectured enumeration formulas for the number of cluster variables at given ranks. The study reports high-accuracy discrimination of cluster variables from non-cluster tableaux, identifies key features via gradient saliency, and makes the generated datasets publicly available to support future investigations in mathematics and physics. The results demonstrate the utility of data-driven approaches in uncovering intricate combinatorial patterns within Grassmannian cluster algebras and their applications to scattering amplitudes.

Abstract

Classification of cluster variables in cluster algebras (in particular, Grassmannian cluster algebras) is an important problem, which has direct application to computations of scattering amplitudes in physics. In this paper, we apply the tableaux method to classify cluster variables in Grassmannian cluster algebras $\mathbb{C}[Gr(k,n)]$ up to $(k,n)=(3,12), (4,10)$, or $(4,12)$ up to a certain number of columns of tableaux, using HPC clusters. These datasets are made available on GitHub. Supervised and unsupervised machine learning methods are used to analyse this data and identify structures associated to tableaux corresponding to cluster variables. Conjectures are raised associated to the enumeration of tableaux at each rank and the tableaux structure which creates a cluster variable, with the aid of machine learning.
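
To make the tableau encoding concrete, the following is a minimal Python sketch, not the authors' code. It checks the semistandard condition (rows weakly increasing, columns strictly increasing) for a rectangular tableau with $k$ rows and entries in $\{1,\dots,n\}$, and pads tableaux of different ranks to a common width so they can be used as fixed-size inputs. The function names, the fill value 0, and right-side padding are assumptions made for illustration; the summary only states that padded SSYT are used.

```python
from typing import List

Tableau = List[List[int]]  # list of rows; all rows have the same length


def is_semistandard(t: Tableau, n: int) -> bool:
    """Return True if t is a rectangular SSYT with entries in 1..n."""
    if not t or any(len(row) != len(t[0]) for row in t):
        return False  # empty or not rectangular
    if any(x < 1 or x > n for row in t for x in row):
        return False  # entry outside 1..n
    # Rows must weakly increase from left to right.
    rows_ok = all(a <= b for row in t for a, b in zip(row, row[1:]))
    # Columns must strictly increase from top to bottom.
    cols_ok = all(
        a < b for upper, lower in zip(t, t[1:]) for a, b in zip(upper, lower)
    )
    return rows_ok and cols_ok


def pad_tableau(t: Tableau, max_rank: int, fill: int = 0) -> Tableau:
    """Right-pad every row with `fill` (an assumed convention) up to max_rank columns."""
    if any(len(row) > max_rank for row in t):
        raise ValueError("tableau has more columns than max_rank")
    return [row + [fill] * (max_rank - len(row)) for row in t]


def flatten(t: Tableau) -> List[int]:
    """Row-major flattening, e.g. as input features for a classifier or PCA."""
    return [x for row in t for x in row]


if __name__ == "__main__":
    t = [[1, 2], [3, 4], [5, 6]]  # k = 3 rows, rank r = 2 columns
    print(is_semistandard(t, n=6))
    print(flatten(pad_tableau(t, max_rank=4)))
```

Passing the semistandard check does not by itself make a tableau a cluster variable; separating the two cases is the classification problem the paper studies.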

Paper Structure (23 sections, 22 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Example images produced from the padded versions of the SSYT representing cluster variables in the respective Grassmannians. Note that for ${\mathbb {C}}[\mathop{\mathrm{Gr}}\nolimits(k,n)]$, $k$ represents the number of rows, $n$ the maximum entry, and $r$ the rank and hence the number of columns. These example images have the maximum rank in each case.
  • Figure 2: PCA decomposition (linear kernel) of the SSYT CV Grassmannian and NCV data for each of the respective datasets. The PCA shows that the NCV data generation is representative in the principal components.
  • Figure 3: PCA decomposition of the SSYT data for the 3 Grassmannians, using (a) linear and (b) Gaussian kernels respectively. Note there is significant overlap between ${\mathbb {C}}[\mathop{\mathrm{Gr}}\nolimits(4,10)]$ $r6$ and ${\mathbb {C}}[\mathop{\mathrm{Gr}}\nolimits(4,12)]$ $r4$ as expected, and cluster separation is largely due to padding, hence correctly clustering according to rank. The Gaussian kernel PCA was computed over a sample of 10,000 CV SSYT from each Grassmannian due to memory limits with the full datasets.
  • Figure 4: PCA decomposition (linear kernel) of the ${\mathbb {C}}[\mathop{\mathrm{Gr}}\nolimits(3,12)]$ SSYT data, plotted with partitions according to the rank $r$ or maximum entry $n$. The PCA shows that the clusters separate according to rank, whilst the differing values of $n$ expand the cluster sizes, akin to a mussel. Equivalent behaviour also holds for the other Grassmannians considered.
  • Figure 5: The elbow method for determining the optimum number of K-Means clusters when clustering the ${\mathbb {C}}[\mathop{\mathrm{Gr}}\nolimits(3,12)]$ $r6$ dataset with a penalty factor of 0.01, discouraging too many clusters.
  • ...and 1 more figure
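
The linear PCA and penalised elbow method named in the captions above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline. The captions do not say how the penalty factor enters the objective, so the score used here (inertia normalised by its $k=1$ value, plus `penalty * (k - 1)`) is an assumption, as are all function names. The Gaussian-kernel PCA of Figure 3 is not covered.

```python
import numpy as np


def pca_project(X: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Linear PCA: centre the data, then project onto the top principal axes."""
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def kmeans_inertia(X: np.ndarray, k: int, n_iter: int = 50, seed: int = 0) -> float:
    """Plain Lloyd's algorithm; returns the within-cluster sum of squared distances."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's centre where it was
                centres[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).sum())


def penalised_elbow(X: np.ndarray, k_values, penalty: float = 0.01):
    """Assumed score: inertia(k) / inertia(1) + penalty * (k - 1); lower is better."""
    base = kmeans_inertia(X, 1)
    scores = {k: kmeans_inertia(X, k) / base + penalty * (k - 1) for k in k_values}
    best_k = min(scores, key=scores.get)
    return best_k, scores


if __name__ == "__main__":
    # Synthetic stand-in for flattened, padded tableaux (not real data).
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 0.1, (20, 4)), rng.normal(10.0, 0.1, (20, 4))])
    print(pca_project(X, 2).shape)
    print(penalised_elbow(X, [1, 2, 3, 4])[0])
```

The linear penalty term makes each extra cluster cost a fixed amount, so the selected $k$ is the point where the further drop in normalised inertia no longer pays for another cluster. This is one common way to turn the visual elbow into an argmin.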

Theorems & Definitions (4)

  • Example 2.1
  • Remark 2.2
  • Conjecture 3.1
  • Conjecture 3.2