MadEvolve: Evolutionary Optimization of Cosmological Algorithms with Large Language Models

Tianyi Li; Shihui Zang; Moritz Münchmeyer

TL;DR

MadEvolve is similar to Google's AlphaEvolve but places a stronger emphasis on free parameters and their optimization. We apply it to three problems in computational cosmology.

Abstract

We develop a general framework to discover scientific algorithms and apply it to three problems in computational cosmology. Our code, MadEvolve, is similar to Google's AlphaEvolve, but places a stronger emphasis on free parameters and their optimization. It starts from a human-written baseline implementation of an algorithm and then improves its performance metrics by making iterative changes to the code. MadEvolve supports both auto-differentiable, gradient-based parameter optimization and gradient-free optimization methods. As a further convenience, it automatically generates a report that compares the input algorithm with the evolved algorithm, describes the algorithmic innovations, and lists the free parameters and their function. We apply MadEvolve to the reconstruction of cosmological initial conditions, 21 cm foreground contamination reconstruction, and effective baryonic physics in N-body simulations. In all cases, we find substantial improvements over the baseline algorithm. We make MadEvolve and our three tasks publicly available at madevolve.org.
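The loop described in the abstract (propose a code change, tune the candidate's free parameters, evaluate, keep the best) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not MadEvolve's actual API: the toy fitting task, the names `score`, `propose`, `optimize_params`, and `evolve`, the fixed pool of hand-written candidates standing in for the LLM ensemble, and the random-search tuner standing in for the gradient-free optimizer.

```python
import random

# Toy data the candidate algorithms must fit: y = 2x + 1 (illustrative only).
XS = [i / 10 for i in range(11)]
YS = [2.0 * x + 1.0 for x in XS]


def score(model, params):
    """Evaluator stand-in: negative mean squared error (higher is better)."""
    return -sum((model(x, params) - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def baseline(x, p):
    """Baseline 'algorithm' with one free parameter."""
    return p[0]


def linear(x, p):
    """Structural variant with two free parameters."""
    return p[0] * x + p[1]


# Hypothetical stand-in for the LLM ensemble: a fixed pool of
# (model, initial parameters) pairs instead of generated code diffs.
PROPOSALS = [(baseline, [0.0]), (linear, [0.0, 0.0])]


def propose(parent, rng):
    """Return a candidate; a real system would condition on the parent."""
    return rng.choice(PROPOSALS)


def optimize_params(model, params, rng, steps=500, sigma=0.2):
    """Gradient-free random search; only accepts strictly better trials."""
    best, best_score = list(params), score(model, params)
    for _ in range(steps):
        trial = [p + rng.gauss(0.0, sigma) for p in best]
        trial_score = score(model, trial)
        if trial_score > best_score:
            best, best_score = trial, trial_score
    return best, best_score


def evolve(generations=10, seed=0):
    """Outer loop: propose, tune parameters, evaluate, keep the best."""
    rng = random.Random(seed)
    params, baseline_score = optimize_params(baseline, [0.0], rng)
    best = (baseline, params, baseline_score)
    for _ in range(generations):
        model, init = propose(best[0], rng)
        params, candidate_score = optimize_params(model, init, rng)
        if candidate_score > best[2]:
            best = (model, params, candidate_score)
    return baseline_score, best


if __name__ == "__main__":
    base, (model, params, final) = evolve()
    print(model.__name__, params, base, final)
```

The point of the sketch is the nesting: each structural candidate is scored only after its free parameters have been tuned, so a proposal is judged at its best rather than at its initial parameter values.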

Paper Structure (62 sections, 14 equations, 9 figures, 8 tables)


Introduction
Related Work
Program Synthesis and Repair with LLMs and Agents
Evolutionary and Quality–Diversity (QD) Search
Automated Discovery in the Physical Sciences
The MadEvolve Framework
Overview
Core Architecture and Evolution Loop
Parameter Tracking and Optimization
Greedy One-Shot Optimization
Auto-differentiable Parameter Optimization
Choosing Between Strategies
Hybrid Population Management
LLM Ensemble and Generation Strategies
Automated Report Generation
...and 47 more sections

Figures (9)

Figure 1: Overview of the LLM-driven evolutionary pipeline in MadEvolve. The cycle proceeds as a closed loop: the Prompt Sampler retrieves parent programs from the Program DB to query the LLM Ensemble. The generated code diffs first undergo Parameter Optimization to refine continuous variables before being submitted to the Evaluator Pool. Evaluation scores update the database, completing the iteration. Finally, the best solution is transmitted to the central Report Generator to produce human-readable explanations of the discovered innovations.
Figure 2: Evolution dynamics of BAO reconstruction algorithm discovery over 1,165 generations. The blue curve tracks the best training-set score $\bar{r}_{\text{BAO}}$ achieved by any program in the population, while the green curve indicates the number of autodiff-optimizable parameters in the best-performing algorithm. The red dashed line marks the initial baseline performance ($\bar{r}_{\text{BAO}} = 0.752$) achieved by the two-parameter Zel'dovich method. The evolutionary trajectory exhibits distinct phases: rapid initial improvement during the first 100 generations, a gradual increase through $\bar{r}_{\text{BAO}} \approx 0.85$--$0.88$ as parameter count stabilizes near 10, a notable jump around generation 600--650, and continued refinement toward the final best score of $\bar{r}_{\text{BAO}} = 0.924$ found at generation 1,144. The +22.8% improvement demonstrates sustained evolutionary progress through the combination of LLM-proposed structural modifications and autodiff parameter optimization.
Figure 3: A comparison of reconstruction performance for different algorithms. The results are averaged over nine simulations in the test set (fiducial 1-9). The curves show the cross-correlation function of the dark matter field (black), standard reconstruction result (red), iterative reconstruction result (green), evolved algorithm result based on standard reconstruction (blue), evolved algorithm result based on iterative reconstruction (orange), and non-linear reconstruction result (purple) with respect to the initial conditions. Left: Reconstruction is executed on a $256^3$ mesh (which we evolved on). Right: Reconstruction is executed on a $512^3$ mesh (which we did not evolve on). The shaded region corresponds to Fourier modes beyond the Nyquist frequency in the evolving configuration.
Figure 4: Evolution dynamics for the BAO reconstruction run starting from the iterative reconstruction baseline (Schmittfull et al. 2017). The blue curve tracks the best $\bar{r}_{\text{BAO}}$ achieved by any program in the population across 386 generations. Starting from $\bar{r}_{\text{BAO}} = 0.933$, the evolutionary search discovers differentiable refinements that push performance to $\bar{r}_{\text{BAO}} = 0.959$ ($+2.8\%$), demonstrating that the framework can improve upon an already well-optimized reconstruction algorithm.
Figure 5: The cross-correlation coefficient $r(k_\perp, k_\parallel)$ of the intensity map with respect to the underlying matter field. The results are averaged over nine simulations in the test set (fiducial 1-9).
...and 4 more figures
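
The captions above report cross-correlation coefficients without defining them. As a reading aid, the standard definition for two fields $X$ and $Y$ is given below; it is assumed here, since this excerpt does not state the paper's exact estimator, binning, or the averaging that defines $\bar{r}_{\text{BAO}}$.

```latex
% Standard cross-correlation coefficient between fields X and Y
% (assumed definition; not quoted from the paper).
% P_{XY} is the cross power spectrum; P_{XX}, P_{YY} are the auto power spectra.
r(k) = \frac{P_{XY}(k)}{\sqrt{P_{XX}(k)\, P_{YY}(k)}},
\qquad -1 \le r(k) \le 1 .

% The anisotropic version in Figure 5 bins modes by their components
% perpendicular and parallel to the line of sight:
r(k_\perp, k_\parallel) =
  \frac{P_{XY}(k_\perp, k_\parallel)}
       {\sqrt{P_{XX}(k_\perp, k_\parallel)\, P_{YY}(k_\perp, k_\parallel)}} .
```

A value of $r = 1$ means the two fields have identical phases in that bin, so a reconstruction with $r$ closer to 1 recovers more of the target field's information at that scale.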
