Collab-Solver: Collaborative Solving Policy Learning for Mixed-Integer Linear Programming

Siyuan Li; Yifan Yu; Zhihao Zhang; Mengjing Chen; Fangzhou Zhu; Tao Zhong; Peng Liu; Jianye Hao

TL;DR

Collab-Solver is a multi-agent framework, formulated as a Stackelberg game, that jointly learns cut-selection and branching policies for MILP solvers. It trains in two phases: data-communicated pretraining, followed by concurrent two-timescale finetuning that stabilizes collaboration between the two modules. Experiments on six benchmark MILP problems and long-horizon datasets show substantial improvements in solving time and primal-dual gap over single-module learners and hyperparameter-tuning baselines, with strong generalization to unseen instances. The results indicate that coordinated policy learning across tightly coupled solver components can outperform optimizing each component in isolation, and they point toward extending the collaboration to additional solver modules for broader industrial impact.

Abstract

Mixed-integer linear programming (MILP) is a fundamental problem in combinatorial optimization. Conventional MILP solving relies mainly on carefully designed heuristics embedded in the branch-and-bound framework. Driven by the strong capabilities of neural networks, recent research explores the value of machine learning alongside conventional MILP solving. Although learning-based MILP methods have shown great promise, existing works typically learn policies for individual solver modules in isolation, without considering their interdependence, which limits both solving efficiency and solution quality. To address this limitation, we propose Collab-Solver, a novel multi-agent policy learning framework for MILP that enables collaborative policy optimization across multiple modules. Specifically, we formulate the collaboration between cut selection and branching in MILP solving as a Stackelberg game. Under this formulation, we develop a two-phase learning paradigm to stabilize collaborative policy learning: the first phase performs data-communicated policy pretraining, and the second phase further orchestrates policy learning across the modules. Extensive experiments on both synthetic and large-scale real-world MILP datasets demonstrate that the jointly learned policies significantly improve solving performance. Moreover, the policies learned by Collab-Solver generalize well across different instance sets.
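
To make the Stackelberg formulation and the two-timescale idea concrete, the following is a minimal toy sketch, not the paper's algorithm. It assumes a scalar leader variable `x` and follower variable `y` with hand-picked quadratic losses; the function names, losses, and learning rates are all hypothetical and chosen only for illustration. The follower updates on a fast timescale so that it stays close to its best response, while the leader updates slowly and accounts for how the follower reacts.

```python
# Toy two-timescale gradient scheme for a Stackelberg game (illustrative only).
# Assumed losses (not from the paper):
#   follower: g(x, y) = (y - x)^2        -> best response y*(x) = x
#   leader:   f(x, y) = (x - 1)^2 + y^2  -> with y = y*(x), minimised at x = 0.5


def follower_grad(x: float, y: float) -> float:
    """Gradient of the follower loss g with respect to y."""
    return 2.0 * (y - x)


def leader_total_grad(x: float, y: float) -> float:
    """Total derivative of the leader loss f with respect to x.

    Uses the known best-response slope dy*/dx = 1 for this toy game, so the
    leader anticipates the follower's reaction instead of treating y as fixed.
    """
    best_response_slope = 1.0
    return 2.0 * (x - 1.0) + 2.0 * y * best_response_slope


def two_timescale(steps: int = 5000,
                  lr_leader: float = 0.01,
                  lr_follower: float = 0.1) -> tuple[float, float]:
    """Run concurrent updates with a fast follower and a slow leader."""
    x, y = 0.0, 0.0
    for _ in range(steps):
        # Fast timescale: follower tracks its best response to the current x.
        y -= lr_follower * follower_grad(x, y)
        # Slow timescale: leader moves using the anticipated follower reaction.
        x -= lr_leader * leader_total_grad(x, y)
    return x, y


if __name__ == "__main__":
    x, y = two_timescale()
    print(f"leader x = {x:.4f}, follower y = {y:.4f}")
```

In this toy game both variables settle at the Stackelberg point where the follower best-responds and the leader's anticipated loss is minimized. Which solver module plays leader or follower in Collab-Solver, and how its gradients are estimated, is not specified in this section, so the sketch deliberately leaves the roles abstract.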
