Interpretable Analytic Calabi-Yau Metrics via Symbolic Distillation
D Yang Eng
TL;DR
This work demonstrates that a compact analytic model can faithfully reproduce neural surrogates for Calabi–Yau metric determinants governed by the Monge–Ampère PDE. By distilling a five-term formula in the gauge-invariant quantities $p_2$ and $\sigma_3$, the authors achieve $R^2\approx0.9994$ with a $3{,}000\times$ reduction in parameters, and they validate the formula across the studied Dwork-family moduli range using volume and Yukawa-coupling benchmarks. The functional form remains stable as the moduli change, with coefficients $c_i(\psi)$ varying smoothly; singular terms capture essential geometric corrections, revealing a hierarchical modulation that mirrors the PDE-constrained structure. The approach accelerates physics calculations by enabling microsecond evaluations, which makes large-scale moduli scans feasible, with the practical accuracy limit set by teacher noise. This work also connects with concurrent efforts on symbolic representations of Kähler potentials, highlighting a general principle: fixed PDE structure yields a low-dimensional, interpretable manifold for complex geometric observables.
Abstract
Calabi--Yau manifolds are essential for string theory, but their metrics are intractable to compute analytically. Here we show that symbolic regression can distill neural approximations into simple, interpretable formulas. Our five-term expression matches neural accuracy ($R^2 = 0.9994$) with 3,000-fold fewer parameters. Multi-seed validation confirms that geometric constraints select essential features, specifically power sums and symmetric polynomials, while permitting structural diversity. The functional form can be maintained across the studied moduli range ($\psi \in [0, 0.8]$) with coefficients varying smoothly; we interpret these trends as empirical hypotheses within the accuracy regime of the locally trained teachers ($\sigma \approx 8\text{--}9\%$ at $\psi \neq 0$). The formula reproduces physical observables (volume integrals and Yukawa couplings), validating that symbolic distillation recovers compact, interpretable models for quantities previously accessible only to black-box networks.
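To make the feature set concrete, the following minimal sketch computes a power sum and an elementary symmetric polynomial of the kind the abstract names, together with the $R^2$ score used to compare a formula against a teacher. It rests on assumptions not stated above: that the features are built from normalized squared moduli $x_i = |z_i|^2 / \sum_j |z_j|^2$ of the five homogeneous coordinates, that $p_2 = \sum_i x_i^2$, and that $\sigma_3$ is the third elementary symmetric polynomial of the $x_i$. The function names are hypothetical, and the five-term formula itself is not reproduced here.

```python
import numpy as np


def invariants(z):
    """Return (p2, sigma3) for points z of shape (n_points, 5), complex.

    Assumed definitions (not taken from the source):
      x_i    = |z_i|^2 / sum_j |z_j|^2   (invariant under z -> lambda * z)
      p2     = sum_i x_i^2               (second power sum)
      sigma3 = sum_{i<j<k} x_i x_j x_k   (third elementary symmetric polynomial)
    """
    x = np.abs(z) ** 2
    x = x / x.sum(axis=1, keepdims=True)
    p1 = x.sum(axis=1)            # equals 1 after normalization
    p2 = (x ** 2).sum(axis=1)
    p3 = (x ** 3).sum(axis=1)
    # Newton's identity: e3 = (p1^3 - 3 p1 p2 + 2 p3) / 6
    sigma3 = (p1 ** 3 - 3.0 * p1 * p2 + 2.0 * p3) / 6.0
    return p2, sigma3


def r_squared(y_true, y_pred):
    """Coefficient of determination between teacher values and formula values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Under these assumptions both features are unchanged by complex rescaling of $z$ and by permuting the coordinates, which is the sense in which they would be gauge-invariant; a candidate formula $f(p_2, \sigma_3)$ could then be scored against teacher outputs with `r_squared`.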
