Large Language Model-Based Automatic Formulation for Stochastic Optimization Models
Amirreza Talebi
TL;DR
The paper investigates using large language models (LLMs) to automatically formulate and solve stochastic optimization problems from natural-language descriptions, focusing on three problem classes: individual and joint chance-constrained programs and two-stage stochastic MILP (SMILP-2). It introduces a hybrid prompting framework with multiple specialized agents and a soft-scoring metric to evaluate algebraic and structural correctness beyond exact canonical matches. Empirical results across GPT-3.5 and GPT-4 variants show that structured, multi-agent prompting (especially with GPT-4-Turbo) yields higher partial and constraint-satisfaction scores, while objective matching remains challenging. The work lays the groundwork for language-driven, testable pipelines to generate and validate stochastic optimization formulations, with potential for real-world adoption in supply chains and decision-making under uncertainty.
Abstract
This paper presents an integrated, systematic study of the performance of large language models (LLMs), specifically ChatGPT, for automatically formulating and solving Stochastic Optimization (SO) problems from natural language descriptions. Focusing on three key categories (individual chance-constrained models, joint chance-constrained models, and two-stage stochastic mixed-integer linear programming models), we design several prompts that guide ChatGPT through structured tasks using chain-of-thought and agentic reasoning. We introduce a novel soft-scoring metric that evaluates the structural quality and partial correctness of generated models, addressing the limitations of canonical and execution-based accuracy metrics. Across a diverse set of SO problems, GPT-4-Turbo achieves better partial scores than GPT-3.5 variants except for individual chance-constrained problems. Structured prompts significantly outperform simple prompting, reducing extra-element generation and improving objective matching, although extra-element generation remains a nontrivial task. Our findings reveal that with well-engineered prompts and multi-agent collaboration, LLMs can facilitate SO formulations, paving the way for intelligent, language-driven modeling pipelines for SO in practice.
