Using Combinatorial Optimization to Design a High quality LLM Solution

Samuel Ackerman, Eitan Farchi, Rami Katan, Orna Raz

TL;DR

The paper tackles the challenge of designing high-quality LLM-based solutions by systematically designing benchmark inputs and exploring the design space efficiently. It uses combinatorial optimization to curate a small plan $P$ that covers low-order interactions between factors, then applies ANOVA and logistic regression to identify the dominating factors, enabling data-efficient evaluation with minimal human effort. Key contributions include a formal plan-design framework, explicit incorporation of expert knowledge into factor selection, and a methodology for distinguishing influential design choices under a potentially $2\times10^5$-way design space. The approach provides a practical baseline for evaluating autoML-style searches in LLM pipelines and supports rapid revalidation when new LLMs become available.

Abstract

We introduce a novel LLM-based solution design approach that utilizes combinatorial optimization and sampling. Specifically, a set of factors that influence the quality of the solution is identified. These typically include factors that represent prompt types, LLM input alternatives, and parameters governing the generation and design alternatives. Identifying the factors that govern the LLM solution quality enables the infusion of subject matter expert knowledge. Next, a set of interactions between the factors is defined, and combinatorial optimization is used to create a small subset $P$ that ensures all desired interactions occur in $P$. Each element $p \in P$ is then developed into an appropriate benchmark. Applying the alternative solutions to each combination $p \in P$ and evaluating the results facilitates the design of a high-quality LLM solution pipeline. The approach is especially applicable when the design and evaluation of each benchmark in $P$ is time-consuming and involves manual steps and human evaluation. Given its efficiency, the approach can also be used as a baseline to compare and validate an autoML approach that searches over the factors governing the solution.
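To make the idea of a small subset $P$ covering all desired interactions concrete, the following is a minimal sketch in which the desired interactions are all pairs of factor levels. It is not the authors' optimization procedure, which the abstract does not specify: a simple greedy heuristic is swapped in for illustration, and the factor names and levels (`prompt_type`, `input_format`, `temperature`, `retrieval`) are hypothetical.

```python
from itertools import combinations, product

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "prompt_type": ["zero_shot", "few_shot", "chain_of_thought"],
    "input_format": ["raw", "summarized"],
    "temperature": [0.0, 0.7],
    "retrieval": ["on", "off"],
}

def pairwise_plan(factors):
    """Greedily pick full combinations until every pair of levels from
    every two distinct factors appears in at least one chosen combination."""
    names = list(factors)

    def pairs_of(combo):
        # All two-factor interactions present in one full combination.
        return {((f1, combo[f1]), (f2, combo[f2]))
                for f1, f2 in combinations(names, 2)}

    # Every level pair that the plan must cover.
    uncovered = {((f1, a), (f2, b))
                 for f1, f2 in combinations(names, 2)
                 for a in factors[f1]
                 for b in factors[f2]}
    # The full design space (feasible to enumerate only because it is tiny here).
    candidates = [dict(zip(names, levels))
                  for levels in product(*factors.values())]

    plan = []
    while uncovered:
        # Take the combination that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        plan.append(best)
        uncovered -= pairs_of(best)
    return plan

plan = pairwise_plan(factors)
full_size = len(list(product(*factors.values())))
print(f"{len(plan)} of {full_size} combinations cover all pairwise interactions")
for p in plan:
    print(p)
```

Each dictionary in `plan` plays the role of one element $p \in P$ that would then be developed into a benchmark. The greedy choice does not guarantee a minimum-size plan, and for large design spaces the exhaustive candidate list would need to be replaced by a dedicated covering-array or constraint solver.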

Paper Structure (13 sections, 2 figures, 2 tables)