Collab-Solver: Collaborative Solving Policy Learning for Mixed-Integer Linear Programming

Siyuan Li, Yifan Yu, Zhihao Zhang, Mengjing Chen, Fangzhou Zhu, Tao Zhong, Peng Liu, Jianye Hao

TL;DR

Collab-Solver introduces a Stackelberg-based, multi-agent framework to jointly learn policies for cut selection and branching in MILP solvers. It employs a two-phase learning approach, data-communicated pretraining followed by concurrent two-timescale finetuning, to stabilize collaboration between the modules. Empirical results on six benchmark MILP problems and long-horizon datasets show substantial improvements in solving time and primal-dual (PD) integral compared to single-module learners and hyperparameter-tuning baselines, with strong generalization to unseen instances. The work demonstrates that coordinated policy learning among tightly coupled solver components can outperform learning each component in isolation, and points toward extending collaboration to additional solver modules for broader industrial impact.

Abstract

Mixed-integer linear programming (MILP) has been a fundamental problem in combinatorial optimization. Conventional MILP solving mainly relies on carefully designed heuristics embedded in the branch-and-bound framework. Driven by the strong capabilities of neural networks, recent research is exploring the value of machine learning alongside conventional MILP solving. Although learning-based MILP methods have shown great promise, existing works typically learn policies for individual modules in MILP solvers in isolation, without considering their interdependence, which limits both solving efficiency and solution quality. To address this limitation, we propose Collab-Solver, a novel multi-agent-based policy learning framework for MILP that enables collaborative policy optimization for multiple modules. Specifically, we formulate the collaboration between cut selection and branching in MILP solving as a Stackelberg game. Under this formulation, we develop a two-phase learning paradigm to stabilize collaborative policy learning: the first phase performs data-communicated policy pretraining, and the second phase further orchestrates the policy learning for various modules. Extensive experiments on both synthetic and large-scale real-world MILP datasets demonstrate that the jointly learned policies significantly improve solving performance. Moreover, the policies learned by Collab-Solver have also demonstrated excellent generalization abilities across different instance sets.
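To make the Stackelberg and two-timescale ideas mentioned above more concrete, here is a minimal toy sketch. It is not the paper's algorithm: the scalar "policies" `x` (leader) and `y` (follower), the quadratic losses, the learning rates, and the assignment of the upstream module to the leader role are all assumptions chosen purely for illustration. The follower is updated with a large step size so it stays close to its best response, while the leader is updated slowly using a gradient that accounts for how the follower reacts.

```python
# Toy Stackelberg game with two-timescale gradient updates.
# All names and losses here are hypothetical illustrations, not taken from the paper.
#
# Follower loss: f(x, y) = (y - x)^2         -> best response y*(x) = x
# Leader loss:   g(x, y) = (x - 1)^2 + y^2   -> with y = y*(x), minimized at x = 0.5


def follower_grad(x: float, y: float) -> float:
    """Gradient of the follower loss (y - x)^2 with respect to y."""
    return 2.0 * (y - x)


def leader_total_grad(x: float, y: float) -> float:
    """Total derivative of the leader loss with respect to x.

    Uses the known best-response slope dy*/dx = 1 of this toy game, so the
    leader anticipates the follower's reaction (the Stackelberg viewpoint).
    """
    best_response_slope = 1.0
    return 2.0 * (x - 1.0) + 2.0 * y * best_response_slope


def two_timescale(steps: int = 5000,
                  lr_follower: float = 0.1,
                  lr_leader: float = 0.01,
                  x0: float = 0.0,
                  y0: float = 0.0) -> tuple[float, float]:
    """Run concurrent updates: fast follower, slow leader."""
    x, y = x0, y0
    for _ in range(steps):
        # Fast timescale: the follower tracks its best response to the current leader.
        y -= lr_follower * follower_grad(x, y)
        # Slow timescale: the leader moves against a nearly converged follower.
        x -= lr_leader * leader_total_grad(x, y)
    return x, y


if __name__ == "__main__":
    x, y = two_timescale()
    print(f"leader x = {x:.4f}, follower y = {y:.4f}")
```

In this toy game both values settle near 0.5, the Stackelberg equilibrium. If the leader instead used only its partial gradient `2 * (x - 1)`, the iterates would drift toward `x = y = 1`, which is the Nash point of the same game; the gap between the two is one way to see why modeling the leader-follower structure can matter.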

Paper Structure

This paper contains 32 sections, 15 equations, 5 figures, 10 tables, 2 algorithms.

Figures (5)

  • Figure 1: Flowchart of the main MILP solving loop, which involves multiple closely related modules. In this work, we investigate the collaboration between cutting planes and branching, which have an upstream-downstream relationship in the solving loop.
  • Figure 2: The Collab-Solver framework. The upper part illustrates the first learning phase, data-communicated pretraining, and the lower part describes the second learning phase, concurrent joint finetuning.
  • Figure 3: Network structure of $\pi_c$, which is composed of an LSTM encoder, a GCNN encoder, an MLP, and a pointer network.
  • Figure 4: Comparison of the branching policy pretraining process with and without data communication.
  • Figure 5: The hyperparameter study results. The mean and standard deviation of solving time and PD integral under different hyperparameter settings are shown above.