A Reinforcement-Learning-Based Multiple-Column Selection Strategy for Column Generation

Haofeng Yuan; Lichang Fang; Shiji Song

TL;DR

This work tackles slow convergence in column generation (CG) for large-scale LPs by introducing a reinforcement-learning-based strategy that selects multiple columns per CG iteration. It formulates the CG process as a Markov decision process (MDP) and uses an actor-critic network, with a GIN-based encoder and a complete-graph enhanced actor, trained with PPO to learn multi-column selections that account for relations between columns. Empirical results on cutting stock and graph coloring problems show that the RL approach reduces the number of iterations and the runtime compared to both single-column and existing multi-column baselines, and that it generalizes to larger instances. The work improves practical CG performance and offers a framework that can be extended to incorporate additional acceleration techniques.

Abstract

Column generation (CG) is one of the most successful approaches for solving large-scale linear programming (LP) problems. Given an LP with a prohibitively large number of variables (i.e., columns), the idea of CG is to explicitly consider only a subset of columns and iteratively add potential columns to improve the objective value. While adding the column with the most negative reduced cost can guarantee the convergence of CG, it has been shown that adding multiple columns per iteration rather than a single column can lead to faster convergence. However, it remains a challenge to design a multiple-column selection strategy that picks the most promising columns from a large number of candidate columns. In this paper, we propose a novel reinforcement learning (RL)-based multiple-column selection strategy. To the best of our knowledge, it is the first RL-based multiple-column selection strategy for CG. The effectiveness of our approach is evaluated on two sets of problems: the cutting stock problem and the graph coloring problem. Compared to several widely used single-column and multiple-column selection strategies, our RL-based multiple-column selection strategy leads to faster convergence and achieves remarkable reductions in the number of CG iterations and runtime.
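
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes a minimization LP in which a candidate column with cost c_j and coefficients a_j has reduced cost c_j - pi^T a_j for dual values pi, and it implements only the simple greedy rule of keeping the k most negative reduced costs, which is the kind of baseline the paper compares against, not the proposed RL strategy. The function names `reduced_cost` and `select_columns`, the tolerance, and the toy data are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple

Column = Tuple[float, Sequence[float]]  # (cost c_j, constraint coefficients a_j)


def reduced_cost(cost: float, column: Sequence[float], duals: Sequence[float]) -> float:
    """Reduced cost c_j - pi^T a_j of one candidate column (minimization LP)."""
    return cost - sum(p * a for p, a in zip(duals, column))


def select_columns(candidates: Sequence[Column], duals: Sequence[float],
                   k: int, tol: float = 1e-9) -> List[int]:
    """Return indices of up to k candidates with the most negative reduced cost.

    Greedy baseline only (an assumption for illustration): columns are ranked
    independently, so relations between the selected columns are ignored.
    """
    scored = [(reduced_cost(c, a, duals), idx) for idx, (c, a) in enumerate(candidates)]
    improving = sorted((rc, idx) for rc, idx in scored if rc < -tol)
    return [idx for _, idx in improving[:k]]


# Toy cutting-stock-style data: every pattern costs one roll (hypothetical values).
duals = [0.5, 0.4, 0.3]
candidates = [
    (1.0, [2, 0, 0]),
    (1.0, [1, 1, 1]),
    (1.0, [0, 2, 1]),
    (1.0, [0, 0, 3]),
    (1.0, [1, 2, 0]),
]

print(select_columns(candidates, duals, k=1))  # single-column rule
print(select_columns(candidates, duals, k=2))  # multiple-column rule
```

With k=1 this reduces to the classical single-column rule. An empty result means no candidate has a negative reduced cost, which is the usual CG stopping condition.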

Paper Structure (28 sections, 10 equations, 4 figures, 5 tables)


Introduction
Related Work
Acceleration Methods for Column Generation.
Machine-Learning-based Column Selection Strategy.
Basis of Column Generation
Methodology
MDP Formulations
State $\mathcal{S}$.
Action $\mathcal{A}$.
Transition $\mathcal{T}$.
Reward $\mathcal{R}$.
Model
Encoder.
Critic Decoder.
Actor Decoder.
...and 13 more sections

Figures (4)

Figure 1: The iterative process of CG.
Figure 2: A toy example of a state. The left part illustrates the bipartite graph representation of the current RMP, including 4 constraint nodes, 3 existing column nodes, and 4 candidate column nodes. The right part shows the global feature vector for the problem instance.
Figure 3: The architecture of the encoder. Colored nodes denote feature vectors or embeddings.
Figure 4: The architecture of the critic decoder and actor decoder, shown for a toy example of selecting 2 of 4 candidates.
