OptiMUS: Optimization Modeling Using MIP Solvers and large language models

Ali AhmadiTeshnizi; Wenzhi Gao; Madeleine Udell

TL;DR

OptiMUS presents an end-to-end LLM-driven agent that translates natural language optimization requests into MILP/LP formulations, solver code, and validated solutions. By introducing SNOP as a structured NL representation and augmenting with automated tests and problem rephrasings, it integrates LLM reasoning with traditional solvers (Gurobi, cvxpy) to improve solve rates. The NLP4LP dataset provides 52 benchmark instances to evaluate NL-to-optimization pipelines. Empirical results show substantial performance gains over naive prompting, particularly with debugging, automated testing, and augmentation, highlighting the approach's potential to broaden access to optimization across domains.

Abstract

Optimization problems are pervasive across various sectors, from manufacturing and distribution to healthcare. However, most such problems are still solved heuristically by hand rather than optimally by state-of-the-art solvers, as the expertise required to formulate and solve these problems limits the widespread adoption of optimization tools and techniques. We introduce OptiMUS, a Large Language Model (LLM)-based agent designed to formulate and solve MILP problems from their natural language descriptions. OptiMUS is capable of developing mathematical models, writing and debugging solver code, developing tests, and checking the validity of generated solutions. To benchmark our agent, we present NLP4LP, a novel dataset of linear programming (LP) and mixed integer linear programming (MILP) problems. Our experiments demonstrate that OptiMUS solves nearly twice as many problems as a basic LLM prompting strategy. The OptiMUS code and the NLP4LP dataset are available at https://github.com/teshnizi/OptiMUS.
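
The cycle of writing solver code, running it, testing the result, and revising on failure can be sketched roughly as follows. This is an illustrative sketch only, not the OptiMUS implementation: `execute`, `revise_until_valid`, `stub_generate`, and `stub_check` are hypothetical names, and the stub generator stands in for a real LLM call.

```python
from typing import Callable, Optional, Tuple


def execute(code: str) -> Tuple[Optional[dict], Optional[str]]:
    """Run candidate code; return (namespace, None) or (None, error message)."""
    namespace: dict = {}
    try:
        exec(code, namespace)
    except Exception as exc:
        return None, f"{type(exc).__name__}: {exc}"
    return namespace, None


def revise_until_valid(
    problem: str,
    generate: Callable[[str, Optional[str]], str],
    check: Callable[[dict], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Regenerate code with error feedback until it runs and passes the check."""
    feedback = None
    for round_number in range(1, max_rounds + 1):
        code = generate(problem, feedback)
        namespace, error = execute(code)
        if error is None:
            error = check(namespace)  # only test code that ran cleanly
        if error is None:
            return code, round_number
        feedback = error  # the next attempt sees the latest error message
    return None, max_rounds


def stub_generate(problem: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM: improves once it sees feedback."""
    if feedback is None:
        return "solution = {'x': 10 / 0}"  # crashes at runtime
    if "ZeroDivisionError" in feedback:
        return "solution = {'x': 7}"  # runs, but violates x <= 5
    return "solution = {'x': 5}"


def stub_check(namespace: dict) -> Optional[str]:
    """Toy test: the reported x must satisfy x <= 5."""
    x = namespace["solution"]["x"]
    return None if x <= 5 else f"constraint violated: x = {x} exceeds 5"


final_code, rounds_used = revise_until_valid(
    "maximize x subject to x <= 5", stub_generate, stub_check
)
print(final_code, rounds_used)
```

The loop treats runtime errors and failed tests the same way: both produce a message that is fed into the next generation attempt, and the round budget bounds how many attempts are made.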

Paper Structure (13 sections, 1 equation, 16 figures)

This paper contains 13 sections, 1 equation, 16 figures.

Introduction
Challenges of Optimization Modeling using LLMs
Methodology
Structured Natural language Optimization Problem (SNOP)
Formulation
Code Generation
Tests and revision
Augmentation
Dataset
Experiments and Analysis
Related Work
Conclusion
Appendix

Figures (16)

Figure 1: An illustration explaining how OptiMUS uses various components to effectively model and solve optimization problems. First, a mathematical formulation is generated from the problem representation. The solver code is then generated based on the formulation. The code is executed to generate a solution and save it to a file. The solution is then checked against a set of unit tests generated by the LLM and revised by the user. If the code does not run or fails the tests, it is passed back to the LLM along with the relevant error message for revision until it is fixed (dashed lines may be executed multiple times). Otherwise, the output is selected as the final solution. An example template is shown in the bottom left corner.
Figure 2: Scaling OptiMUS to problems with large numerical data: Instead of passing everything to the LLM directly (left), in OptiMUS we separate the numerical data from the problem description and give only the metadata to the LLM (right). The LLM then writes code to interact with the data file.
Figure 3: (a) An example of a real-world optimization problem and a SNOP representation for it. (b) An example markdown formulation of a problem generated by OptiMUS. (c) Example rephrasings generated by OptiMUS from a problem info statement in the augmentation process.
Figure 4: OptiMUS prompts include instructions to avoid common coding mistakes. For example, ChatGPT commonly uses cvxpy.sum on generator objects instead of lists. Adding the instruction "- cvxpy.sum takes a list as input, and not a generator" to the code generation template reduces the incidence of this mistake. Top: generated code before the instruction. Bottom: generated code after adding the instruction.
Figure 5: An LLM can be used to generate tests and check the correctness of the output. After the problem is fed to an LLM using the test generation template, the model generates a script that checks the correctness of the output (constraint satisfaction, output format, etc.) and returns an appropriate error message if it finds a problem. The error message is used to automatically fix the code.
...and 11 more figures
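
As a rough illustration of the ideas in Figures 2 and 5, the sketch below keeps the numerical data in a separate file and runs a small test script that checks a candidate solution for output format and constraint satisfaction, returning error messages that could be passed back for revision. Everything here is an assumption made for illustration: the file layout, the field names (`products`, `hours_per_unit`, `available_hours`), the toy numbers, and the function `check_solution` do not come from the paper.

```python
import json
import tempfile
from pathlib import Path

# Toy data file (illustrative values only). Only the field names, not the
# numbers, would need to be described to the model.
DATA_PATH = Path(tempfile.mkdtemp()) / "data.json"
DATA_PATH.write_text(json.dumps({
    "products": ["chairs", "tables"],
    "hours_per_unit": {"chairs": 2, "tables": 3},
    "available_hours": 12,
}))


def check_solution(data_path, solution, tol=1e-6):
    """Return a list of error messages; an empty list means every check passed."""
    data = json.loads(Path(data_path).read_text())
    names = data["products"]

    # Output-format check: every decision variable must be reported.
    missing = [name for name in names if name not in solution]
    if missing:
        return [f"missing variables: {missing}"]

    errors = []
    # Non-negativity check on each variable.
    for name in names:
        if solution[name] < -tol:
            errors.append(f"{name} is negative: {solution[name]}")

    # Capacity check: total hours used must not exceed the available hours.
    used = sum(data["hours_per_unit"][name] * solution[name] for name in names)
    if used > data["available_hours"] + tol:
        errors.append(f"capacity exceeded: {used} > {data['available_hours']}")
    return errors


print(check_solution(DATA_PATH, {"chairs": 3, "tables": 2}))
print(check_solution(DATA_PATH, {"chairs": 6, "tables": 1}))
```

Returning messages instead of raising on the first failure lets a single run report every violated check, which gives a revision step more to work with.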
