A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization

Shengyu Feng, Weiwei Sun, Shanda Li, Ameet Talwalkar, Yiming Yang

TL;DR

FrontierCO introduces a comprehensive, realistic benchmark for evaluating contemporary ML-based solvers on eight CO problems using large-scale, industry-inspired instances and standardized training data. The study systematically compares 16 ML solvers (neural and LLM-based) against state-of-the-art human-designed solvers under a fixed time budget. It reveals a persistent performance gap, especially on hard instances: neural solvers show scalability limits, and LLM agents display high variability. Ablations and analysis show that neural modules help when added to weaker baselines, while LLMs tend to rediscover known metaheuristics rather than invent new strategies, which suggests hybrid neural-symbolic approaches as a promising direction. By providing standardized best-known solutions (BKS), training data, and a unified evaluation framework, FrontierCO offers a reproducible platform that guides robust advancement in ML for combinatorial optimization and informs practical deployment in real-world settings.

Abstract

Machine learning (ML) has demonstrated considerable potential in supporting model design and optimization for combinatorial optimization (CO) problems. However, much of the progress to date has been evaluated on small-scale, synthetic datasets, raising concerns about the practical effectiveness of ML-based solvers in real-world, large-scale CO scenarios. Additionally, many existing CO benchmarks lack sufficient training data, limiting their utility for evaluating data-driven approaches. To address these limitations, we introduce FrontierCO, a comprehensive benchmark that covers eight canonical CO problem types and evaluates 16 representative ML-based solvers, including graph neural networks and large language model (LLM) agents. FrontierCO features challenging instances drawn from industrial applications and frontier CO research, offering both realistic problem difficulty and abundant training data. Our empirical results provide critical insights into the strengths and limitations of current ML methods, helping to guide more robust and practically relevant advances at the intersection of machine learning and combinatorial optimization. Our data is available at https://huggingface.co/datasets/CO-Bench/FrontierCO.

Paper Structure

This paper contains 31 sections, 3 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Overview of FrontierCO.
  • Figure 2: Gap (%) comparison of classical and ML-based solvers across eight CO problems with easy and hard test instances (lower is better). Classical solvers are shown in deep blue, neural solvers in green, and LLM agentic solvers in reddish colors.
  • Figure 3: Solving time (in seconds) for the SOTA classical solvers on eight CO problems with easy and hard test sets, respectively.
  • Figure 4: Training dynamics of neural solvers on Euclidean and non-Euclidean STP instances.
  • Figure 5: Word cloud of the algorithms generated by LLM-based solvers.
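
The "Gap (%)" reported in Figure 2 is not defined in this summary. As a minimal sketch, assuming the common convention of a relative gap between a solver's objective value and the best-known solution (BKS), it could be computed as below. The function name `gap_percent` and the exact formula are assumptions for illustration, not necessarily the paper's definition.

```python
def gap_percent(cost: float, bks: float) -> float:
    """Relative gap (%) between a solver's objective value and the BKS.

    Assumed convention: |cost - bks| / |bks| * 100, so 0 means the solver
    matched the best-known solution and lower is better.
    """
    if bks == 0:
        # The relative gap is undefined when the reference value is zero.
        raise ValueError("gap is undefined when the best-known value is zero")
    return abs(cost - bks) / abs(bks) * 100.0


# Hypothetical numbers: a tour of length 105 against a best-known length of 100
# gives a gap of roughly 5%.
example_gap = gap_percent(cost=105.0, bks=100.0)
```

Taking absolute values keeps the measure non-negative for both minimization and maximization problems. A benchmark may instead use a signed or capped variant, so the definition in the paper should take precedence.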