
Let's Have a Conversation: Designing and Evaluating LLM Agents for Interactive Optimization

Joshua Drossman, Alexandre Jacquillat, Sébastien Martin

Abstract

Optimization is as much about modeling the right problem as solving it. Identifying the right objectives, constraints, and trade-offs demands extensive interaction between researchers and stakeholders. Large language models can empower decision-makers with optimization capabilities through interactive optimization agents that can propose, interpret, and refine solutions. However, conversation-based interactions are fundamentally harder to evaluate than traditional one-shot approaches. This paper proposes a scalable and replicable methodology for evaluating optimization agents through conversations. We build LLM-powered decision agents that role-play diverse stakeholders, each governed by an internal utility function but communicating like a real decision-maker. We generate thousands of conversations in a school scheduling case study. Results show that one-shot evaluation is severely limiting: the same optimization agent converges to much higher-quality solutions through conversations. This paper then uses this methodology to demonstrate that tailored optimization agents, endowed with domain-specific prompts and structured tools, can achieve significant improvements in solution quality in fewer interactions than general-purpose chatbots. These findings provide evidence of the benefits of emerging solutions at the AI-optimization interface to expand the reach of optimization technologies in practice. They also uncover the impact of operations research expertise in facilitating interactive deployments through the design of effective and reliable optimization agents.

Paper Structure

This paper contains 44 sections, 5 equations, 11 figures, and 6 tables.

Figures (11)

  • Figure 1: Visualization of approaches to optimization in practice.
  • Figure 2: Example of LLM-based autoformulation: a school principal describes an optimization problem in precise technical language and the tool returns a mathematical formulation of the problem.
  • Figure 3: Example of a conversational interaction: the principal is far less precise in their goals and requirements and the tool navigates the problem's landscape to guide them to a good solution.
  • Figure 4: LLM-optimization evaluation frameworks. One-shot evaluation: an optimization agent creates a formulation based on one question, and aims to match a ground truth. Empirical evaluation: an optimization agent is scored based on interactions with human decision-makers. Our approach: an optimization agent is scored based on interactions with decision agents based on verifiable and rigorous criteria.
  • Figure 5: The optimization agent's architecture includes an LLM, prompt, and toolkit, allowing the agent to read/write (R/W) from/to the model, pass the model to the solver, derive its output, and summarize the solution for the decision-maker. This illustration corresponds to the T-P-P design.
  • ...and 6 more figures