Deep Learning Calabi-Yau four folds with hybrid and recurrent neural network architectures

H. L. Dao

TL;DR

The work investigates recurrent neural architectures for learning Calabi–Yau four-fold Hodge numbers on the CICY4 dataset, challenging CNN-centric approaches by showing that hybrid CNN–LSTM and pure LSTM models can achieve near state-of-the-art accuracy with substantially smaller models and shorter training times. Across the 72% and 80% data splits, LSTM-based hybrids consistently outperform CNN–GRU hybrids, with the best single-model results reaching around 99.7% for $h^{1,1}$, 98% for $h^{2,1}$, 95% for $h^{3,1}$, and 81–84% for $h^{2,2}$; ensembles boost performance further, achieving up to about 99.9%, 99.0%, 97.0%, and 87% respectively. The study demonstrates the viability of RNN-based architectures for topological data in string theory and outlines future directions, including transformer-based approaches. These findings are of practical significance for rapid, scalable prediction of CY topological data and suggest that sequential models may apply more broadly to geometric ML tasks within string theory.

Abstract

In this work, we report the results of applying deep learning based on hybrid convolutional-recurrent and purely recurrent neural network architectures to the dataset of almost one million complete intersection Calabi-Yau four-folds (CICY4) to machine-learn their four Hodge numbers $h^{1,1}, h^{2,1}, h^{3,1}, h^{2,2}$. In particular, we explored and experimented with twelve different neural network models, nine of which are convolutional-recurrent (CNN-RNN) hybrids with the RNN unit being either GRU (Gated Recurrent Unit) or LSTM (Long Short-Term Memory). The remaining four models are purely recurrent neural networks based on LSTM. In terms of the $h^{1,1}, h^{2,1}, h^{3,1}, h^{2,2}$ prediction accuracies, at 72% training ratio, our best-performing individual model is CNN-LSTM-400, a hybrid CNN-LSTM with an LSTM hidden size of 400, which obtained 99.74%, 98.07%, 95.19%, 81.01%. Our second-best individual model is LSTM-448, an LSTM-based model with a hidden size of 448, which obtained 99.74%, 97.51%, 94.24%, 78.63%. These results were improved by forming ensembles of the top two, three or even four models. Our best ensemble, consisting of the top four models, achieved accuracies of 99.84%, 98.71%, 96.26%, 85.03%. At 80% training ratio, the top two performing models, LSTM-448 and LSTM-424, are both LSTM-based, with hidden sizes of 448 and 424. Compared with the 72% training ratio, there is a significant improvement in accuracy, which reached 99.85%, 98.66%, 96.26%, 84.77% for the best individual model and 99.90%, 99.03%, 97.97%, 87.34% for the best ensemble. By nature a proof of concept, the results of this work conclusively established the utility of RNN-based architectures and demonstrated their effective performance compared to the well-explored purely CNN-based architectures in the problem of deep learning Calabi-Yau manifolds.
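The abstract reports one accuracy per Hodge number and notes that ensembles of the top models improve on individual ones. The sketch below illustrates how such figures could be computed. The abstract does not say how the ensembles combine their members, so averaging the real-valued outputs and rounding to the nearest integer is an assumption here, as are the function names and the toy numbers.

```python
import math
from typing import Dict, List, Sequence

# Order of the four targets, matching the abstract.
HODGE_NAMES = ("h11", "h21", "h31", "h22")


def ensemble_predict(model_outputs: Sequence[Sequence[Sequence[float]]]) -> List[List[int]]:
    """Average real-valued outputs over models, then round to integers.

    model_outputs[m][i][k] is model m's output for sample i and Hodge
    number k. Averaging-then-rounding is an assumed combination rule.
    """
    n_models = len(model_outputs)
    n_samples = len(model_outputs[0])
    combined = []
    for i in range(n_samples):
        row = []
        for k in range(len(HODGE_NAMES)):
            mean = sum(m[i][k] for m in model_outputs) / n_models
            # Round half up; Hodge numbers are non-negative integers.
            row.append(math.floor(mean + 0.5))
        combined.append(row)
    return combined


def per_hodge_accuracy(predictions: Sequence[Sequence[int]],
                       targets: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Fraction of samples predicted exactly, separately per Hodge number."""
    n = len(targets)
    return {
        name: sum(1 for p, t in zip(predictions, targets) if p[k] == t[k]) / n
        for k, name in enumerate(HODGE_NAMES)
    }


# Toy data (hypothetical values, not taken from CICY4).
targets = [[1, 0, 20, 100], [2, 3, 30, 150]]
model_a = [[1.2, 0.1, 20.4, 101.6], [2.1, 3.4, 29.0, 150.2]]
model_b = [[0.9, 0.2, 19.8, 99.0], [1.8, 2.9, 30.6, 151.4]]

ensemble = ensemble_predict([model_a, model_b])
print(per_hodge_accuracy(ensemble, targets))
```

Accuracy is computed per target rather than per sample because the abstract quotes four separate percentages. A stricter metric would require all four numbers to match at once.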

Paper Structure (31 sections, 10 equations, 37 figures, 18 tables)

Introduction
Dataset and data preparation
Overview of the CICY4 dataset
Data preparation
Basics of recurrent neural networks
Neural network architectures
CNN-RNN hybrid neural networks
CNN-GRU hybrid
CNN-LSTM hybrid
ResNet-RNN hybrid
LSTM-based neural networks
Training results (using 72% dataset)
CNN-GRU hybrid neural networks
CNN-LSTM hybrid neural networks
LSTM-based neural networks
...and 16 more sections

Figures (37)

Figure 1: Histograms of the four Hodge numbers $h^{1,1}$ (top left), $h^{2,1}$ (top right), $h^{3,1}$ (bottom left), and $h^{2,2}$ (bottom right) for the full, training, validation and test datasets. Figure adapted from Figure 1 of inception.
Figure 2: Histograms of the four Hodge numbers $h^{1,1}$ (top left), $h^{2,1}$ (top right), $h^{3,1}$ (bottom left), and $h^{2,2}$ (bottom right) for the training and test datasets at 72% and 80% data splits.
Figure 5: A schematic diagram of the overall architecture of all neural networks.
Figure 6: A schematic diagram of the architecture of CNN-GRU hybrid neural networks.
Figure 9: A schematic of the architecture of CNN-LSTM hybrid neural networks.
...and 32 more figures
