
Infeasibility Aware Large Language Models for Combinatorial Optimization

Yakun Wang, Min Chen, Zeguan Wu, Junyu Liu, Sitao Zhang, Zhenwen Shao

Abstract

Large language models (LLMs) are increasingly explored for NP-hard combinatorial optimization problems, but most existing methods emphasize feasible-instance solution generation and do not explicitly address infeasibility detection. We propose an infeasibility-aware framework that combines certifiable dataset construction, supervised fine-tuning, and LLM-assisted downstream search. For the minor-embedding problem, we introduce a new mathematical programming formulation together with provable zero-phase infeasibility screening, which enables scalable construction of training instances labeled either as feasible with structured certificates or as certifiably infeasible. Using training data generated through this exact optimization pipeline, we show that an 8B-parameter LLM can be fine-tuned to jointly perform solution generation and infeasibility detection. We further utilize LLM outputs as warm starts for downstream local search, providing a practical way to accelerate optimization even when the LLM outputs are imperfect. Experiments show that our fine-tuned model improves overall accuracy by up to 30% over GPT-5.2, while LLM-guided warm starts provide up to a $2\times$ speedup compared with starting from scratch in downstream local search.
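As a toy illustration of the warm-start idea described in the abstract (this is not the paper's pipeline; the objective, neighborhood, and the "LLM-suggested" starting point below are invented for illustration), a greedy local search typically needs no more improvement steps when seeded near a good solution than when started from a random one:

```python
import random

def hill_climb(start, objective, neighbors, max_iters=10_000):
    """Greedy local search: move to the best improving neighbor until stuck."""
    x, iters = start, 0
    while iters < max_iters:
        iters += 1
        best = min(neighbors(x), key=objective, default=x)
        if objective(best) >= objective(x):
            break  # local optimum reached
        x = best
    return x, iters

# Toy objective: minimize the number of 1-bits in a 32-bit tuple.
objective = lambda bits: sum(bits)

def neighbors(bits):
    # Single-bit-flip neighborhood.
    for i in range(len(bits)):
        yield bits[:i] + (1 - bits[i],) + bits[i + 1:]

random.seed(0)
cold = tuple(random.randint(0, 1) for _ in range(32))        # start from scratch
warm = tuple(b if i % 8 else 0 for i, b in enumerate(cold))  # partial "LLM" solution

_, cold_iters = hill_climb(cold, objective, neighbors)
_, warm_iters = hill_climb(warm, objective, neighbors)
assert warm_iters <= cold_iters  # warm start never needs more steps here
```

Because every improving step removes exactly one 1-bit, the iteration count equals the number of 1-bits in the start plus one, so the warm start (which has some bits already zeroed) converges at least as fast.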

Paper Structure

This paper contains 76 sections, 2 theorems, 38 equations, 5 figures, 10 tables, 2 algorithms.

Key Result

Theorem 1

Consider a problem graph $P=(V_P,E_P)$, a hardware graph $G=(V_G,E_G)$, and an explicitly listed family $\mathcal{C}$ of connected subsets of $V_G$, with $\{v\} \in \mathcal{C}$ for all $v \in V_G$. Deciding whether there exists an assignment satisfying constraints eq:feas-assign-main--eq:feas-adj-ma
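To make the decision problem behind Theorem 1 concrete, the sketch below (a hypothetical helper, not the paper's formulation) verifies a candidate minor-embedding assignment: each problem vertex maps to a nonempty connected chain of hardware vertices, chains are pairwise disjoint, and every problem edge is realized by at least one hardware edge between the corresponding chains. Theorem 1 concerns deciding whether any such assignment exists, which is the hard part; verifying one is easy:

```python
from collections import deque

def is_valid_embedding(P_edges, G_edges, chains):
    """Check whether `chains` (problem vertex -> set of hardware vertices)
    is a valid minor embedding of problem graph P into hardware graph G."""
    adj = {}
    for u, v in G_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def connected(nodes):
        # BFS restricted to `nodes` must reach all of them.
        nodes = set(nodes)
        if not nodes:
            return False
        seen, queue = set(), deque([next(iter(nodes))])
        while queue:
            x = queue.popleft()
            if x in seen:
                continue
            seen.add(x)
            queue.extend(adj.get(x, set()) & nodes)
        return seen == nodes

    # Chains must be nonempty, connected, and pairwise disjoint.
    used = set()
    for chain in chains.values():
        if not connected(chain) or used & set(chain):
            return False
        used |= set(chain)
    # Every problem edge needs a hardware edge between the two chains.
    return all(
        any(w in adj.get(x, set()) for x in chains[u] for w in chains[v])
        for u, v in P_edges
    )

# Embed the triangle K3 into a 4-cycle: vertex 2 uses a two-node chain.
P = [(0, 1), (1, 2), (0, 2)]
C4 = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
print(is_valid_embedding(P, C4, {0: {"a"}, 1: {"b"}, 2: {"c", "d"}}))  # True
```

The K3-into-C4 example shows why chains matter: no single-vertex assignment embeds a triangle into a square, but merging two hardware vertices into one chain does.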

Figures (5)

  • Figure 1: Framework.
  • Figure 2: Accuracy breakdown by the number of problem-graph vertices. Subplots show results for (a) the full dataset, (b) SAT instances only, and (c) UNSAT instances only.
  • Figure 3: Accuracy breakdown by instance size. Subplots show results for (a) the full dataset, (b) SAT instances only, and (c) UNSAT instances only.
  • Figure 4: Representative sparse hardware topologies used in quantum annealing: Chimera (left) and Pegasus (right). Pegasus has higher local connectivity than Chimera, but both remain sparse compared with many logical problem graphs. The figure was generated using the open-source dwave_networkx package from exact Chimera and Pegasus graph constructions dwave_networkx; Chimera is shown with size parameter $(m=2, n=2, t=4)$ and Pegasus with $m=4$.
  • Figure 5: Token usage for GPT-5.2 and Grok-4.1 across tasks. As instance size increases, these models often default to an "infeasible" verdict, bypassing the search for valid solutions to save tokens. In contrast, our fine-tuned model’s token usage scales linearly with problem size due to its structured format.

Theorems & Definitions (4)

  • Theorem 1
  • Proposition 1: Zero-phase infeasibility certificate
  • Proof
  • Proof