Parameter compression in the flux landscape

Aman Chauhan; Michele Cicoli; Sven Krippendorf; Anshuman Maharana; Pellegrino Piantadosi; Andreas Schachner

Abstract

We present a data-driven investigation of the exhaustive ensemble of no-scale type IIB flux vacua constructed in \cite{Chauhan:2025rdj}. Using a combination of linear and non-linear dimensionality-reduction techniques, we analyse both flux and moduli spaces and demonstrate that the effective dimensionality of the underlying 12-dimensional flux space is substantially reduced. A central component of our study is a physics-informed autoencoder, which provides a non-linear compression of the flux and moduli data into a low-dimensional latent space. The learned latent representation organises vacua according to desired features and, in particular, isolates distinguished regions associated with small values of the flux superpotential $|W_0|$, revealing non-trivial correlations that are not captured by linear methods. In parallel, we apply tools from topological data analysis, specifically persistent homology, to probe the global structure of the vacuum distribution. This allows us to identify robust, long-lived topological features in both moduli and flux subspaces. This work is a necessary step for developing foundation models in string phenomenology.
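As a rough illustration of what "effective dimensionality" can mean on the linear side, the sketch below counts how many principal components are needed to reach a chosen cumulative explained-variance threshold. It is not the paper's pipeline: the function name `effective_dimension`, the 95% threshold, and the toy 12-coordinate integer data (built from three hidden directions) are all assumptions made for illustration.

```python
import numpy as np

def effective_dimension(X, threshold=0.95):
    """Return (k, ratio): k is the number of principal components needed to
    reach `threshold` cumulative explained variance; ratio is the cumulative
    explained-variance curve. Hypothetical helper, not taken from the paper."""
    Xc = X - X.mean(axis=0)                     # centre each coordinate
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values of centred data
    var = s**2                                  # proportional to PCA variances
    ratio = np.cumsum(var) / var.sum()
    k = int(np.searchsorted(ratio, threshold) + 1)
    return k, ratio

rng = np.random.default_rng(0)
# Toy stand-in for flux data (an assumption): 12 integer coordinates
# generated from only 3 independent integer directions.
latent = rng.integers(-5, 6, size=(2000, 3))
mixing = rng.integers(-2, 3, size=(3, 12))
toy_flux = latent @ mixing

k, ratio = effective_dimension(toy_flux)
print(k, np.round(ratio[:4], 3))
```

Because the toy data have rank at most three, at most three components are needed here. Real flux data would not be exactly low-rank, so the choice of threshold matters.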

Paper Structure (11 sections, 22 equations, 9 figures, 5 tables, 2 algorithms)

Introduction
Type IIB flux compactification
Principal Component Analysis
Topological Data Analysis
Persistent homology
Persistent homology on moduli space
Persistent homology on flux space
Autoencoder latent representation
Conclusion and Summary
PCA Details
TDA on dataset A

Figures (9)

Figure 1: Flux configurations projected onto the first and second principal components for dataset A (left column) and dataset B (right column). The lower panels, together with the colour coding in the upper panels, display the corresponding distributions of the superpotential along these principal directions.
Figure 2: Scatter plots of the Euclidean norms of the NS--NS and R--R flux vectors, $\|h\|$ and $\|f\|$, for flux vacua in datasets A and B. The points are colour-coded by $\log_{10}|W_0|$.
Figure 3: Distributions of the moduli for varying values of the flux-induced D3-brane charge $N_{\text{flux}}$ in dataset B: (top left) projection onto the $z^1$-plane, (bottom left) projection onto the $z^2$-plane, and (right) projection onto the $\tau$-plane. In each case, the distributions exhibit reflection symmetry about the imaginary axis and display non-trivial geometric patterns, including arc-like structures and closed loops.
Figure 4: Persistent homology analysis of the vacua in dataset B: (top left) projection onto the $z^1$-plane, (top right) projection onto the $z^2$-plane, (bottom left) projection onto the $\tau$-plane, and (bottom right) the full six-dimensional $(z^1,z^2,\tau)$ moduli space. Homology classes are colour-coded by degree, with $H_0$ (connected components) shown in blue, $H_1$ (one-cycles) in orange, and $H_2$ (two-cycles) in green. The most persistent $H_1$ class is highlighted by $\star$.
Figure 5: Persistent homology of the flux configurations in dataset B: (top left) $h$-flux subspace, (top right) $f$-flux subspace, and (bottom left) the combined $(f,h)$ flux space. The $f$-flux subspace exhibits a larger number of $H_1$ classes (orange) and $H_0$ classes (blue), attributable to the broader range of flux quanta and the increased multiplicity of distinct flux configurations, leading to a more pronounced lattice-like organisation in flux space. Bottom right: Persistence diagram for a reference ensemble generated via empirical coordinate-wise sampling, preserving the marginal integer distributions of the $f$-flux coordinates while removing inter-coordinate correlations.
...and 4 more figures
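The reference ensemble described in the Figure 5 caption (coordinate-wise sampling that keeps each coordinate's marginal integer distribution but removes inter-coordinate correlations) can be sketched as follows. This is a minimal sketch under assumptions: the function name `coordinatewise_resample`, sampling with replacement, and the toy two-coordinate data are illustrative choices, and the paper's exact procedure may differ.

```python
import numpy as np

def coordinatewise_resample(X, rng):
    """Draw each column independently from its own empirical distribution.
    Marginals are preserved in distribution; correlations between columns
    are destroyed. Hypothetical helper, not taken from the paper."""
    n, d = X.shape
    out = np.empty_like(X)
    for j in range(d):
        # sample n values from the observed values of coordinate j
        out[:, j] = rng.choice(X[:, j], size=n, replace=True)
    return out

rng = np.random.default_rng(1)
# Toy stand-in (an assumption): two perfectly correlated integer coordinates.
x = rng.integers(-10, 11, size=5000)
toy = np.column_stack([x, x])

null = coordinatewise_resample(toy, rng)
corr_before = np.corrcoef(toy[:, 0], toy[:, 1])[0, 1]
corr_after = np.corrcoef(null[:, 0], null[:, 1])[0, 1]
print(round(corr_before, 3), round(corr_after, 3))
```

Comparing a persistence diagram of the real data with one computed on such a resampled ensemble indicates which topological features depend on correlations between coordinates rather than on the marginal distributions alone.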
