Transforming Calabi-Yau Constructions: Generating New Calabi-Yau Manifolds with Transformers

Jacky H. T. Yip, Charles Arnal, François Charton, Gary Shiu

TL;DR

This work tackles the challenge of exhaustively exploring the Calabi-Yau landscape arising from FRSTs of 4-dimensional reflexive polytopes, a combinatorially vast problem that defies full enumeration. It introduces CYTransformer, an encoder–decoder transformer that learns to generate FRSTs from polytope data and can improve itself by retraining on its own outputs, enabling scalable, self-directed exploration. The study demonstrates that CYTransformer can efficiently produce unbiased, representative samples of FRSTs across polytopes of increasing complexity and even surpass traditional non-learning fast samplers in many regimes, especially for large FRST spaces. To harness these capabilities, the authors propose AICY, a living platform that combines software tools, self-improving models, and a growing database to systematically map and catalog the Calabi-Yau landscape, with potential for targeted searches guided by physics-informed reward signals or task-specific objectives. Together, these results offer a scalable, community-driven approach to navigating string-theoretic geometries with implications for landscape statistics and phenomenology.

Abstract

Fine, regular, and star triangulations (FRSTs) of four-dimensional reflexive polytopes give rise to toric varieties, within which generic anticanonical hypersurfaces yield smooth Calabi-Yau threefolds. We introduce CYTransformer, a deep learning model based on the transformer architecture, to automate the generation of FRSTs. We demonstrate that CYTransformer efficiently and unbiasedly samples FRSTs for polytopes across a range of sizes, and can self-improve through retraining on its own output. These results lay the foundation for AICY: a community-driven platform designed to combine self-improving machine learning models with a continuously expanding database to explore and catalog the Calabi-Yau landscape.

Paper Structure

This paper contains 32 sections, 11 equations, 17 figures, 1 table.

Figures (17)

  • Figure 1: CYTransformer architecture. The high-level pipeline for our model in inference mode. The encoder processes the input polytope, as a sequence of four-dimensional vertex vectors, into a latent representation. The decoder autoregressively generates tokens, representing simplices, conditioned on both the encoder output and previously generated tokens, sampling from the predicted token distribution $\boldsymbol{P}$ until the end-of-sequence token <eos> is drawn.
  • Figure 2: Encoder and decoder layers (illustration from encoderdecodersource). The encoder layer applies multi-headed self-attention over the input sequence, followed by a feed-forward network. The decoder layer applies masked self-attention (to preserve autoregressive generation), followed by cross-attention to the encoder outputs, and then a feed-forward network. This structure enables conditioning the output sequence on the full input polytope while generating tokens autoregressively.
  • Figure 3: CYTransformer training-time FRST generation curves. Each plot shows the number of distinct FRSTs generated by CYTransformer during training, measured across $1{,}600$ (for $(h^{1,1},N_{\rm vert})=(5,9+1)$ to $(8,12+1)$) or $6{,}400$ (for $(9,13+1)$ and $(10,14+1)$) candidate triangulations, as a function of training step. Each curve within a plot corresponds to a model trained on a different-sized dataset. For example, the label $(2000,6)$ indicates a training set of $2{,}000$ polytopes, each contributing up to $6$ FRSTs (or fewer, if the polytope admits fewer). The label "all" refers to using all available FRSTs for each polytope in the training set, which, depending on the configuration, may mean either the full set of enumerated FRSTs or a capped number generated during data preparation (see the sections on datasets and experimental setup). As expected, performance generally improves with a larger training set.
  • Figure 4: Generation efficiency of CYTransformer across polytopes of increasing complexity. Each panel corresponds to a fixed $(h^{1,1},N_{\rm vert})$, showing the FRST generation curve (top row) and the corresponding FRST generation rate (bottom row) as a function of inference calls $N_{\rm guess}$, averaged over $200$ test polytopes. Three curves are shown: all generated FRSTs (dotted), distinct FRSTs (solid), and distinct NTFE FRSTs (dashed). For simpler polytopes (top left), CYTransformer rapidly saturates the FRST space, leading to a steep drop in the distinct generation rate. For more complex polytopes (bottom right), the rate decays more gradually (and remains nearly flat for $(10,14+1)$), demonstrating the model’s ability to scale and maintain generative diversity across a vast FRST space. The flatness of the all-FRST rate reflects the model’s stable success probability per candidate triangulation, a consequence of the independence of inference calls.
  • Figure 5: Per-polytope FRST recovery curves. Each panel shows the percentage of all distinct FRSTs recovered as a function of inference calls $N_{\rm guess}$, plotted individually for test polytopes within a fixed $(h^{1,1},N_{\rm vert})$ set. While most polytopes exhibit rapid and high recovery, a noticeable subset shows slow or limited recovery, highlighting the influence of polytope geometry on model performance. Some curves plateau early or rise slowly, indicating cases where CYTransformer struggles to learn the full FRST space efficiently. Smooth, gently sloping curves, common in the $(10,14+1)$ case, correspond to polytopes with especially large FRST spaces, for which higher sampling budgets are necessary to achieve meaningful recovery.
  • ...and 12 more figures
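
The quantities plotted in Figures 4 and 5 (cumulative counts of all and distinct FRSTs, the per-call generation rate, and the percentage of FRSTs recovered, each as a function of $N_{\rm guess}$) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the representation of a triangulation as a set of vertex-index tuples, the finite-difference definition of the rate, and the `is_frst` validity check are all assumptions made for illustration; in practice the check would be a geometric test, and the NTFE curve would additionally need an equivalence-class key, which is omitted here.

```python
from typing import Callable, FrozenSet, Iterable, List, Sequence, Tuple

Simplex = Tuple[int, ...]
Triangulation = FrozenSet[Simplex]


def canonical(simplices: Iterable[Sequence[int]]) -> Triangulation:
    """Order-independent form, so the same triangulation always compares equal."""
    return frozenset(tuple(sorted(s)) for s in simplices)


def generation_curves(
    candidates: Iterable[Iterable[Sequence[int]]],
    is_frst: Callable[[Triangulation], bool],
) -> Tuple[List[int], List[int]]:
    """Cumulative counts of all and of distinct FRSTs after each inference call.

    `is_frst` is a hypothetical validity check supplied by the caller.
    """
    seen = set()
    n_all = 0
    all_counts: List[int] = []
    distinct_counts: List[int] = []
    for cand in candidates:
        tri = canonical(cand)
        if is_frst(tri):
            n_all += 1      # every valid output counts toward the "all" curve
            seen.add(tri)   # repeats do not grow the "distinct" curve
        all_counts.append(n_all)
        distinct_counts.append(len(seen))
    return all_counts, distinct_counts


def generation_rate(counts: Sequence[int]) -> List[int]:
    """New FRSTs gained at each call (finite difference; an assumed definition)."""
    return [b - a for a, b in zip([0] + list(counts), counts)]


def recovery_curve(distinct_counts: Sequence[int], total_frsts: int) -> List[float]:
    """Percentage of a polytope's known FRSTs recovered after each call."""
    if total_frsts <= 0:
        raise ValueError("total_frsts must be positive")
    return [100.0 * c / total_frsts for c in distinct_counts]


# Toy data: a "polytope" with exactly two known FRSTs (purely illustrative).
known = {
    canonical([(0, 1, 2), (0, 2, 3)]),
    canonical([(0, 1, 3), (1, 2, 3)]),
}
candidates = [
    [(0, 1, 2), (0, 2, 3)],  # valid
    [(2, 0, 1), (3, 2, 0)],  # the same triangulation, written in another order
    [(0, 1, 2)],             # not one of the known FRSTs
    [(0, 1, 3), (1, 2, 3)],  # valid and new
]
all_counts, distinct_counts = generation_curves(candidates, known.__contains__)
print(all_counts, distinct_counts)
print(generation_rate(distinct_counts))
print(recovery_curve(distinct_counts, len(known)))
```

In this toy run the distinct curve stalls on the second call because the model repeats a triangulation it has already produced, while the all-FRST curve keeps growing. That gap is what separates the dotted and solid curves in Figure 4.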