SOCRATES: Simulation Optimization with Correlated Replicas and Adaptive Trajectory Evaluations
Haoting Zhang, Haoxian Chen, Donglin Zhan, Hanyang Zhao, Henry Lam, Wenpin Tang, David Yao, Zeyu Zheng
TL;DR
SOCRATES tailors simulation-optimization (SO) procedures to expensive-to-evaluate systems by building an ensemble of Operational AI Replicas (OARs) through LLM-guided causal skeleton inference and EM-type learning. A second LLM acts as a meta-optimizer: it assesses baseline SO algorithms on the OAR testbed and composes a final adaptive schedule $\pi=\left((a_j, T_j)\right)$ of algorithm-budget pairs under total budget $T$, which is adapted online when real-system performance diverges from what the replicas predict. The OAR ensemble is made robust through multi-start learning and selection of the top-$K$ replicas, with ensemble weights $w_k^\star \propto \frac{1}{\text{MSE}_k+\epsilon}$ that give resilience in the style of Bayesian model averaging. A data-constructive meta-learning framework over OAR ensembles supports trajectory-based evaluation and semantic guidance from LLMs. It is demonstrated conceptually on canonical SO problems, where switching among complementary algorithm families improves robustness and sample efficiency.
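The top-$K$ selection and inverse-MSE weighting described above can be illustrated with a minimal sketch. The function names, the use of NumPy, and the default value of `eps` are assumptions made for illustration and are not taken from the paper; the only part drawn from the text is the rule $w_k^\star \propto 1/(\text{MSE}_k+\epsilon)$, normalized here so the weights sum to one.

```python
import numpy as np


def select_top_k_weights(mse, k, eps=1e-8):
    """Keep the k replicas with the lowest MSE and weight them by 1/(MSE + eps).

    Hypothetical helper: `mse` holds one validation MSE per learned replica.
    Returns the indices of the selected replicas and their normalized weights.
    """
    mse = np.asarray(mse, dtype=float)
    if not 1 <= k <= mse.size:
        raise ValueError("k must be between 1 and the number of replicas")
    # Stable sort so ties keep their original order.
    top_idx = np.argsort(mse, kind="stable")[:k]
    raw = 1.0 / (mse[top_idx] + eps)
    weights = raw / raw.sum()  # normalize so the weights sum to 1
    return top_idx, weights


def ensemble_predict(predictions, weights):
    """Weighted average of the selected replicas' predictions at one design point."""
    return float(np.dot(weights, np.asarray(predictions, dtype=float)))
```

For example, with replica errors `[0.5, 0.1, 0.3, 2.0]` and `k=2`, the sketch keeps replicas 1 and 2 and weights them roughly 0.75 and 0.25, so the more accurate replica dominates without silencing the other.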
Abstract
The field of simulation optimization (SO) encompasses a range of methods for optimizing complex, expensive-to-sample stochastic systems. Established methods include, but are not limited to, ranking-and-selection for finite alternatives and surrogate-based methods for continuous domains, with broad applications in engineering and operations management. The recent advent of large language models (LLMs) offers a new paradigm for exploiting system structure and for automating the strategic selection and composition of these established SO methods into a tailored optimization procedure. This work introduces SOCRATES (Simulation Optimization with Correlated Replicas and Adaptive Trajectory Evaluations), a novel two-stage procedure that leverages LLMs to automate the design of tailored SO algorithms. The first stage constructs an ensemble of digital replicas of the real system. An LLM performs causal discovery from a textual description of the system, generating a structural "skeleton" that guides the sample-efficient learning of the replicas. In the second stage, this replica ensemble serves as an inexpensive testbed for evaluating a set of baseline SO algorithms. An LLM then acts as a meta-optimizer, analyzing the performance trajectories of these algorithms to iteratively revise and compose a final, hybrid optimization schedule. The schedule is adaptive: it can be updated during final execution on the real system when optimization performance deviates from expectations. By integrating LLM-driven reasoning with LLM-assisted, trajectory-aware meta-optimization, SOCRATES provides an effective and sample-efficient solution for complex SO problems.
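To make the idea of an adaptive hybrid schedule concrete, the following is a minimal sketch of executing a schedule of (algorithm, evaluations) segments under a budget while monitoring deviation from the trajectory the replicas predicted. Everything here is an illustrative assumption rather than the authors' procedure: the function name `run_schedule`, the minimization convention, the fixed `tolerance` threshold, and in particular the `fallback` algorithm, which is a simple stand-in for the LLM-driven schedule revision described in the abstract.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Schedule = List[Tuple[str, int]]  # segments (algorithm name, number of evaluations)


def run_schedule(
    schedule: Schedule,
    algorithms: Dict[str, Callable[[int], float]],
    expected_best: Sequence[float],
    budget: int,
    tolerance: float,
    fallback: str,
) -> Tuple[float, List[Tuple[int, str, float]]]:
    """Run the segments in order (minimization) and switch to `fallback`
    once the real best-so-far lags the replica prediction by more than
    `tolerance`. Each algorithm is modeled as a step function that takes
    the evaluation index and returns one observed objective value.
    """
    total = sum(t_j for _, t_j in schedule)
    if total > budget:
        raise ValueError("schedule exceeds the evaluation budget")
    if len(expected_best) < total:
        raise ValueError("expected_best must cover every scheduled evaluation")

    best = float("inf")
    switched = False
    log: List[Tuple[int, str, float]] = []
    t = 0
    for name, t_j in schedule:
        for _ in range(t_j):
            active = fallback if switched else name
            best = min(best, algorithms[active](t))
            log.append((t, active, best))
            # Deviation check against the trajectory predicted on the replicas.
            if not switched and best - expected_best[t] > tolerance:
                switched = True  # stand-in for an LLM-driven schedule revision
            t += 1
    return best, log
```

In this sketch the switch is one-way and uses a single fallback; a fuller treatment would re-plan the remaining segments instead, which is where the meta-optimizer would come in.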
