Toward TransfORmers: Revolutionizing the Solution of Mixed Integer Programs with Transformers

Joshua F. Cooper; Seung Jin Choi; I. Esra Buyuktahtakin

TL;DR

The study tackles the Capacitated Lot Sizing Problem (CLSP), a challenging mixed-integer program (MIP), by predicting its binary production decisions with an encoder-decoder transformer. After inference, the predictions fix the binary variables, which reduces the MIP to a linear program (LP) and yields a polynomial-time approximation for this combinatorial problem. On a large synthetic dataset (up to 1.2 million instances, with 240k test instances), the post-processed transformer achieves 0% infeasibility and a 0% optimality gap while reducing CPLEX/Gurobi solve times by over 99%. Compared with an LSTM baseline, the transformer is faster and produces better solutions, especially when an End-of-Sequence token is used, which highlights the practical potential of transformer-based learned heuristics for dynamic MIPs in operations research.

Abstract

In this study, we introduce a deep learning framework that employs a transformer model to solve mixed-integer programs, focusing on the Capacitated Lot Sizing Problem (CLSP). To our knowledge, our approach is the first to use transformers to predict the binary variables of a mixed-integer programming (MIP) problem. Specifically, it harnesses the encoder-decoder transformer's ability to process sequential data, which makes it well suited to predicting the binary variables that indicate the production setup decision in each period of the CLSP. The problem is inherently dynamic and requires sequential decision making under constraints. We present an efficient algorithm in which CLSP solutions are learned by a transformer neural network. The proposed post-processed transformer algorithm surpasses both the state-of-the-art solver CPLEX and a Long Short-Term Memory (LSTM) model in solution time, optimality gap, and percent infeasibility over the 240K benchmark CLSP instances tested. Once the ML model is trained, running inference on it reduces the MIP to a linear program (LP). Combined with an LP solver, the ML-based algorithm thus becomes a polynomial-time approximation algorithm for a well-known NP-hard problem, with almost perfect solution quality.
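The reduction from MIP to LP described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `solve_with_fixed_setups`, its parameters, and the toy data are hypothetical. It also assumes zero initial inventory, no backlogging, and unit production and holding costs that are constant over time and nonnegative. Under those assumptions the LP that remains after fixing the setups is solved exactly by producing as late as possible, so the sketch needs no LP solver; a general CLSP instance would instead be passed to an LP solver at this step.

```python
from typing import List, Optional, Tuple


def solve_with_fixed_setups(
    demand: List[float],
    capacity: List[float],
    setups: List[int],
    setup_cost: float,
    holding_cost: float,
    unit_cost: float,
) -> Optional[Tuple[List[float], List[float], float]]:
    """Solve a CLSP whose binary setup decisions are already fixed.

    Hypothetical helper. Assumes zero initial inventory, no backlogging,
    and time-invariant, nonnegative unit and holding costs.
    Returns (production, end-of-period inventory, total cost),
    or None when the fixed setups leave no feasible production plan.
    """
    horizon = len(demand)
    production = [0.0] * horizon
    carry = 0.0  # demand that still has to be produced in an earlier period

    # Backward pass: cover each period's requirement as late as possible,
    # which keeps inventory (and thus holding cost) as low as possible.
    for t in range(horizon - 1, -1, -1):
        need = demand[t] + carry
        if setups[t] == 1:
            production[t] = min(capacity[t], need)
        carry = need - production[t]

    if carry > 1e-9:
        return None  # the predicted setups cannot meet demand

    # Forward pass: recover end-of-period inventory levels.
    inventory = []
    stock = 0.0
    for t in range(horizon):
        stock += production[t] - demand[t]
        inventory.append(stock)

    cost = (
        setup_cost * sum(setups)
        + unit_cost * sum(production)
        + holding_cost * sum(inventory)
    )
    return production, inventory, cost


# Toy instance with four periods; the setup vectors stand in for model output.
demand = [40.0, 60.0, 30.0, 50.0]
capacity = [100.0, 100.0, 100.0, 100.0]

feasible = solve_with_fixed_setups(demand, capacity, [1, 0, 1, 0], 100.0, 1.0, 2.0)
infeasible = solve_with_fixed_setups(demand, capacity, [0, 1, 1, 0], 100.0, 1.0, 2.0)
```

In this toy instance the first setup vector yields a feasible plan, while the second leaves period 1 demand uncovered and returns `None`. A prediction of the second kind is what an infeasibility-removing post-processing step would have to repair before the LP is solved.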

Paper Structure (14 sections, 1 equation, 1 figure, 5 tables)

Introduction
Methodology
CLSP: Mixed Integer Program Formulation
Design of the Transformer Model
Data and Preprocessing
Post-Processing Transformer Predictions
Results
Model Performance
Comparative Analysis
Significance of Research Results
Discussion and Future Work
Implications for Operations Research
Limitations
Future Work

Figures (1)

Figure 1: Post-processing schema employed to remove infeasibility in predictions
