Optimal Inflationary Potentials

Tomás Sousa, Deaglan J. Bartlett, Harry Desmond, Pedro G. Ferreira

TL;DR

This work tackles the underdetermination of inflationary potentials by introducing Exhaustive Symbolic Regression (ESR) to generate all simple single-field, slow-roll potentials from chosen operator bases. Potentials are ranked using the Minimum Description Length (MDL) criterion, balancing data fit against structural complexity, and optionally reweighted with a Katz back-off language model to reflect theoretical motivation. The analysis reveals MDL-optimal forms often involve highly nested exponential structures, with language priors shifting preferences toward more conventional, physically motivated shapes; tensor-to-scalar ratios can remain very small even for relatively simple expressions, and some results approach the sensitivity of upcoming surveys. The approach demonstrates a data-driven, principled route to extract implications for fundamental physics from cosmological data, while highlighting limitations tied to basis choice, prior definitions, and the need for reheating and stability considerations in viable models.

Abstract

Inflation is a highly favoured theory for the early Universe. It is compatible with current observations of the cosmic microwave background and large scale structure and is a driver in the quest to detect primordial gravitational waves. It is also, given the current quality of the data, highly under-determined with a large number of candidate implementations. We use a new method in symbolic regression to generate all possible simple scalar field potentials for one of two possible basis sets of operators. Treating these as single-field, slow-roll inflationary models we then score them with an information-theoretic metric ("minimum description length") that quantifies their efficiency in compressing the information in current data. We explore two possible priors on the parameter space of potentials, one related to the functions' structural complexity and one that uses a Katz back-off language model to prefer functions that may be theoretically motivated. This enables us to identify the inflaton potentials that optimally balance simplicity with accuracy at explaining current data, which may subsequently find theoretical motivation. Our exploratory study opens the door to extraction of fundamental physics directly from data, and may be augmented with more refined theoretical priors in the quest for a complete understanding of the early Universe.
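
As a rough illustration of the ranking step described above, the sketch below scores candidate functions with a simplified description length. It keeps only two terms: the negative log-likelihood at the best fit and a $k\log(n)$ structural-complexity penalty, with $k$ taken to be the number of nodes in the expression tree and $n$ the number of distinct operators. The `Candidate` class, the candidate names and all numbers are hypothetical, and the parameter-encoding terms of a full description length are deliberately omitted, so this is not the authors' implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str            # hypothetical label for a candidate potential
    neg_log_like: float  # -log(likelihood) at the best-fit parameters
    k: int               # number of nodes in the expression tree
    n: int               # number of distinct operators in the expression


def description_length(c: Candidate) -> float:
    """Simplified score: accuracy term plus k*log(n) complexity penalty."""
    return c.neg_log_like + c.k * math.log(c.n)


def rank(candidates):
    """Sort candidates from lowest (best) to highest description length."""
    return sorted(candidates, key=description_length)


# Made-up numbers: more complex candidates fit slightly better.
candidates = [
    Candidate("simple", neg_log_like=12.0, k=3, n=2),
    Candidate("medium", neg_log_like=8.0, k=5, n=3),
    Candidate("complex", neg_log_like=7.5, k=9, n=4),
]

ranked = rank(candidates)
best_fit = min(candidates, key=lambda c: c.neg_log_like)

for c in ranked:
    print(f"{c.name:8s} L(D) = {description_length(c):.3f}")
```

In this toy setup the best-fitting candidate is not the one with the lowest description length, which is the trade-off between accuracy and simplicity that the criterion is meant to capture.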

Paper Structure (15 sections, 21 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Pareto front of inflationary potentials found with ESR when compared to the data for the two basis sets. We show the best-fitting functions according to the change in the description length, $L(D)$, (red) and the likelihood, $\mathcal{L}$, (blue) relative to the corresponding minima. More accurate functions appear at lower $|\Delta\log(\mathcal{L})|$, while overall superior functions appear at lower $L(D)$. For the solid line we use the $k\log(n)$ term in the description length to penalise model complexity, while for the dashed line we instead use the Katz language model.
  • Figure 2: Inflationary trajectories of the potentials with the Minimum Description Length for different analyses: (a) The MDL function for both basis sets A and B using the $k\log(n)$ function prior, (b) The MDL potential for basis set A with the language model prior, and (c) The MDL expression for basis set B with the language model. For context, in (d) we give the corresponding plot for Starobinsky or Higgs inflation, with $\theta_{0}$ optimised to fit $A_S$, and in (e) we give the potential which is at the "knee" of the Pareto front (\ref{eq:bestfit}). The shaded region shows the range of $\phi$ during which inflation occurs, where the inflaton rolls from the region of high to low potential.
  • Figure 3: Variation of the predicted tensor-to-scalar ratio, $r$, with complexity. For both sets of basis operators, the lowest achievable $r$ decreases with the complexity of the potential. We also plot the prediction of the MDL potentials in red, where the result using the $k\log(n)$ prior is solid and with the language model is dashed. The blue line shows the predictions of the potentials which maximise the likelihood at a given complexity.
  • Figure 4: The predicted tensor-to-scalar ratio and spectral index for the ten best models in each analysis (\ref{tab:SetA klogn}, \ref{tab:SetB klogn}, \ref{tab:SetA katz}, \ref{tab:SetB katz}). The 68% and 95% CL regions from \ref{eq:n_S} and \ref{eq:r} are shown in blue. We colour the points by the change in description length relative to the optimal model for each analysis, such that darker points indicate better potentials.
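
To make the quantities in the captions above concrete, here is a minimal sketch of how a single-field potential maps to a spectral index $n_S$ and tensor-to-scalar ratio $r$ at leading order in slow roll. It uses the standard relations $\epsilon = \tfrac{1}{2}(V'/V)^2$, $\eta = V''/V$, $n_S = 1 - 6\epsilon + 2\eta$ and $r = 16\epsilon$ in reduced Planck units. The Starobinsky-type potential, the choice of 60 e-folds, the finite-difference derivatives and all function names are assumptions made for illustration; the section does not state the exact settings used in the paper.

```python
import math

A = math.sqrt(2.0 / 3.0)


def V(phi):
    """Starobinsky-type potential; the amplitude is set to 1 (it cancels in n_S and r)."""
    return (1.0 - math.exp(-A * phi)) ** 2


def dV(phi, h=1e-3):
    """First derivative by central finite differences."""
    return (V(phi + h) - V(phi - h)) / (2.0 * h)


def d2V(phi, h=1e-3):
    """Second derivative by central finite differences."""
    return (V(phi + h) - 2.0 * V(phi) + V(phi - h)) / h**2


def epsilon(phi):
    return 0.5 * (dV(phi) / V(phi)) ** 2


def eta(phi):
    return d2V(phi) / V(phi)


def bisect(f, lo, hi, iters=60):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def efolds(phi_start, phi_end, steps=2000):
    """N = integral of V/V' from phi_end to phi_start (trapezoidal rule)."""
    h = (phi_start - phi_end) / steps
    total = 0.0
    for i in range(steps + 1):
        phi = phi_end + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * V(phi) / dV(phi)
    return total * h


N_TARGET = 60.0  # assumed number of e-folds before the end of inflation

# Inflation ends where epsilon reaches 1.
phi_end = bisect(lambda p: epsilon(p) - 1.0, 0.1, 10.0)
# Field value N_TARGET e-folds before the end of inflation.
phi_star = bisect(lambda p: efolds(p, phi_end) - N_TARGET, phi_end, 10.0)

n_s = 1.0 - 6.0 * epsilon(phi_star) + 2.0 * eta(phi_star)
r = 16.0 * epsilon(phi_star)
print(f"phi_end = {phi_end:.3f}, phi_star = {phi_star:.3f}")
print(f"n_S = {n_s:.4f}, r = {r:.4f}")
```

Swapping in a different `V` (and adjusting the bisection brackets to suit it) gives the corresponding point in the $n_S$-$r$ plane, under the same leading-order assumptions.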