Solver-Independent Automated Problem Formulation via LLMs for High-Cost Simulation-Driven Design

Yuchen Li, Handing Wang, Bing Xue, Mengjie Zhang, Yaochu Jin

TL;DR

APF addresses the problem of translating ambiguous natural-language design requirements into executable optimization formulations for high-cost simulation-driven design, using a solver-independent, LLM-based pipeline. The pipeline combines automated data generation, test-instance annotation, and ranking-based evaluation to build a high-quality fine-tuning dataset, which is then used to fine-tune open-source LLMs so that they produce accurate formulations. Evaluation against test-instance rankings replaces expensive solver feedback, and data augmentation and selection further improve generalization. Case studies in antenna design show that APF outperforms prompting baselines and large models in both formulation quality and end-task performance, suggesting practical value for industrial design workflows.

Abstract

In high-cost simulation-driven design, translating ambiguous design requirements into a mathematical optimization formulation is a bottleneck for optimizing product performance. This process is time-consuming and relies heavily on expert knowledge. While large language models (LLMs) offer potential for automating this task, existing approaches either produce poor formalizations that fail to align with the design intent or rely on solver feedback for data filtering, which is unavailable because of the high simulation costs. To address this challenge, we propose APF, a framework for solver-independent, automated problem formulation via LLMs that converts engineers' natural-language requirements into executable optimization models. The core of the framework is a pipeline that automatically generates high-quality data through data generation and test instance annotation, overcoming the difficulty of constructing suitable fine-tuning datasets without high-cost solver feedback. The generated dataset is used for supervised fine-tuning of LLMs, significantly enhancing their ability to generate accurate and executable optimization problem formulations. Experimental results on antenna design demonstrate that APF significantly outperforms existing methods in both the accuracy of requirement formalization and the quality of the resulting radiation efficiency curves in meeting the design goals.

Paper Structure

This paper contains 34 sections, 8 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Formalizing Requirements in High-Cost Simulation-Driven Design: From Manual Expertise to LLM-based Workflow
  • Figure 2: Overview of the APF framework. (a) Data Generation: Design requirements are derived from the simulation dataset and rewritten by the LLM to produce corresponding model equations. (b) Test Instance Annotation: For each requirement, a set of test instances is generated and annotated with reference rankings by the LLM. (c) Data Evaluation and Selection: Generated equations are evaluated against the LLM-based rankings, and high-quality samples are selected to construct the training set. (d) Supervised Fine-Tuning: The selected dataset is used to fine-tune an open-source LLM, significantly enhancing its capability to generate accurate and executable design formulations.
  • Figure 3: An example radiation efficiency curve is divided into five key frequency bands, each with distinct design requirements.
  • Figure 4: The distribution of quality scores for the samples.
  • Figure 5: Comparison of radiation efficiency curves optimized using formulations generated by different methods.
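
The ranking-based evaluation and selection step described in the Figure 2 caption, parts (b) and (c), can be illustrated with a minimal sketch. This is not the authors' implementation. The summary above does not state which agreement metric APF uses, so Kendall's tau is an assumption here, as are the selection threshold, the toy efficiency curves, and all function names. The idea shown is only that a candidate objective can be scored by how well the ranking it induces over the test instances matches a reference ranking, with no simulation or solver call.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence

Curve = List[float]                      # hypothetical test instance: sampled efficiency values
Objective = Callable[[Curve], float]     # candidate formulation; lower value = better design


def ranks_from_scores(scores: Sequence[float]) -> List[int]:
    """Convert objective values to ranks (1 = best, i.e. lowest objective)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def kendall_tau(rank_a: Sequence[int], rank_b: Sequence[int]) -> float:
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def formulation_quality(objective: Objective,
                        instances: Sequence[Curve],
                        reference_ranks: Sequence[int]) -> float:
    """Score a candidate by agreement between its ranking and the reference ranking."""
    scores = [objective(curve) for curve in instances]
    return kendall_tau(ranks_from_scores(scores), reference_ranks)


def select_samples(candidates: Dict[str, Objective],
                   instances: Sequence[Curve],
                   reference_ranks: Sequence[int],
                   threshold: float = 0.8) -> List[str]:
    """Keep candidates whose quality meets the (assumed) threshold."""
    return [name for name, objective in candidates.items()
            if formulation_quality(objective, instances, reference_ranks) >= threshold]


# Toy data (made up for illustration): three curves and a reference ranking
# in which the first curve is best and the second is worst.
instances = [[0.9, 0.8, 0.85], [0.5, 0.6, 0.55], [0.7, 0.7, 0.75]]
reference_ranks = [1, 3, 2]

candidates = {
    "maximize_mean": lambda curve: -sum(curve) / len(curve),  # matches the intent
    "minimize_mean": lambda curve: sum(curve) / len(curve),   # inverted intent
}

selected = select_samples(candidates, instances, reference_ranks)
print(selected)
```

In this toy setup the candidate that rewards high mean efficiency reproduces the reference ranking exactly and is kept, while the inverted candidate is rejected. A real pipeline would use the LLM-annotated rankings and the generated model equations in place of these hand-written stand-ins.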