LLM-Guided Evolutionary Program Synthesis for Quasi-Monte Carlo Design
Amir Sadikov
TL;DR
This paper casts two core QMC design problems, constructing finite point sets with minimal star discrepancy and optimizing Sobol' direction numbers, as program-synthesis tasks solved by an LLM-guided evolutionary loop within the OpenEvolve framework. The approach yields new best-known low-discrepancy point sets in 2D (for $N \ge 40$) and in 3D (beyond the proven frontier $N \le 8$), and it discovers Sobol' parameters that reduce rQMC mean-squared error on several 32-dimensional option-pricing tasks relative to the widely used Joe–Kuo parameters, while remaining extensible to arbitrary $N$ and compatible with standard randomizations. Key contributions include a two-phase discovery strategy (direct construction followed by iterative refinement), rigorous discrepancy evaluation, and empirical validation on multiple high-dimensional problems. The results demonstrate that LLM-driven evolution can automate high-quality QMC design, recovering classical constructions when they are optimal and surpassing them when finite-$N$ structure matters, with open data and code for reproducibility.
Abstract
Low-discrepancy point sets and digital sequences underpin quasi-Monte Carlo (QMC) methods for high-dimensional integration. We cast two long-standing QMC design problems as program synthesis and solve them with an LLM-guided evolutionary loop that mutates and selects code under task-specific fitness: (i) constructing finite 2D/3D point sets with low star discrepancy, and (ii) choosing Sobol' direction numbers that minimize randomized quasi-Monte Carlo (rQMC) error on downstream integrands. Our two-phase procedure combines constructive code proposals with iterative numerical refinement. On finite sets, we rediscover known optima in small 2D cases and set new best-known 2D benchmarks for $N \ge 40$, while matching most known 3D optima up to the proven frontier ($N \le 8$) and reporting improved 3D benchmarks beyond it. On digital sequences, evolving Sobol' parameters yields consistent reductions in rQMC mean-squared error for several 32-dimensional option-pricing tasks relative to the widely used Joe–Kuo parameters, while preserving extensibility to any sample size and compatibility with standard randomizations. Taken together, the results demonstrate that LLM-driven evolutionary program synthesis can automate the discovery of high-quality QMC constructions, recovering classical designs where they are optimal and improving on them where finite-$N$ structure matters. Data and code are available at https://github.com/hockeyguy123/openevolve-star-discrepancy.git.
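To make the fitness in task (i) concrete, the following is a minimal sketch of an exact star-discrepancy evaluation for a small point set in $[0,1]^d$. It is not the paper's evaluation code: the function name `star_discrepancy` is hypothetical, and the method is the standard brute-force enumeration over the grid of box corners built from the point coordinates (plus 1), comparing the counts in the half-open and closed anchored boxes against the box volume. Its cost grows roughly as $N^{d+1}$, so it only suits the small 2D/3D cases.

```python
from itertools import product


def star_discrepancy(points):
    """Exact L-infinity star discrepancy of a small point set in [0, 1]^d.

    Hypothetical helper for illustration; brute force, so keep N small.
    """
    n = len(points)
    d = len(points[0])
    # Candidate box corners: every coordinate seen in each dimension, plus 1.0.
    axes = [sorted({p[j] for p in points} | {1.0}) for j in range(d)]
    worst = 0.0
    for corner in product(*axes):
        volume = 1.0
        for c in corner:
            volume *= c
        # Points strictly inside the half-open box [0, corner).
        n_open = sum(all(p[j] < corner[j] for j in range(d)) for p in points)
        # Points inside the closed box [0, corner].
        n_closed = sum(all(p[j] <= corner[j] for j in range(d)) for p in points)
        # Local discrepancy can be too few points (open box) or too many (closed box).
        worst = max(worst, volume - n_open / n, n_closed / n - volume)
    return worst
```

For example, `star_discrepancy([(0.5, 0.5)])` evaluates to 0.75, and the centered 1D grid `[(0.125,), (0.375,), (0.625,), (0.875,)]` gives 0.125, which is the known 1D optimum $1/(2N)$ for $N = 4$. In an evolutionary loop of the kind described above, a value like this could serve as the quantity to minimize for a candidate construction.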
