Discounted Pseudocosts in MILP

Krunal Kishor Patel

TL;DR

This work introduces discounted pseudocosts, a reinforcement-learning-inspired lookahead mechanism for MILP branching that combines traditional pseudocosts with discounted future LP-relaxation gains through a discount factor $\gamma$. The method aims to approximate lookahead branching without the computational cost of full lookahead, and it is integrated with reliability branching to maintain robustness. Experiments on MIPLIB 2017 show small, occasionally tangible improvements in solving time and node counts on hard instances, but the overall gains are not yet substantial, which suggests the approach merits further tuning and extension. The approach requires no offline training and could extend to other MILP heuristics and related optimization domains.

Abstract

In this article, we introduce the concept of discounted pseudocosts, inspired by discounted total reward in reinforcement learning, and explore their application in mixed-integer linear programming (MILP). Traditional pseudocosts estimate changes in the objective function due to variable bound changes during the branch-and-bound process. By integrating reinforcement learning concepts, we propose a novel approach incorporating a forward-looking perspective into pseudocost estimation. We present the motivation behind discounted pseudocosts and discuss how they represent the anticipated reward for branching after one level of exploration in the MILP problem space. Initial experiments on MIPLIB 2017 benchmark instances demonstrate the potential of discounted pseudocosts to enhance branching strategies and accelerate the solution process for challenging MILP problems.
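To make the idea concrete, here is a minimal Python sketch of a one-level discounted pseudocost update. It is an illustration under stated assumptions, not the paper's exact rule: the names `Pseudocost` and `discounted_unit_gain`, the choice of which deeper-level gain to use, and the sample numbers are all hypothetical. It assumes the discounted gain is the immediate LP-relaxation gain plus $\gamma$ times the gain observed one level deeper, divided by the distance the branching variable was moved.

```python
from dataclasses import dataclass


@dataclass
class Pseudocost:
    """Running average of per-unit objective gains for one variable and direction."""
    total: float = 0.0
    count: int = 0

    def update(self, unit_gain: float) -> None:
        # Record one more observed per-unit gain.
        self.total += unit_gain
        self.count += 1

    @property
    def value(self) -> float:
        # Average of the recorded gains; 0.0 while uninitialized.
        return self.total / self.count if self.count else 0.0


def discounted_unit_gain(gain_now: float, gain_next: float,
                         gamma: float, distance: float) -> float:
    """Per-unit gain with one level of discounted lookahead (assumed form).

    gain_now  : LP objective gain in the child created by this branching
    gain_next : LP objective gain observed one level deeper (hypothetical choice)
    gamma     : discount factor in [0, 1]; gamma = 0 gives the classical pseudocost
    distance  : how far the variable was moved, e.g. its fractional part
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if distance <= 0.0:
        raise ValueError("distance must be positive")
    return (gain_now + gamma * gain_next) / distance


# Hypothetical numbers: x = 2.4 is branched down (distance 0.4), the child LP
# improves the bound by 0.8, and the next level adds a further 0.5.
down = Pseudocost()
down.update(discounted_unit_gain(gain_now=0.8, gain_next=0.5, gamma=0.5, distance=0.4))
print(down.value)
```

With $\gamma = 0$ the update reduces to the usual per-unit gain (here 0.8 / 0.4), so the discount factor controls how much weight the deeper level receives.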

Paper Structure (6 sections, 2 equations, 3 tables)

This paper contains 6 sections, 2 equations, 3 tables.

Introduction
Reinforcement Learning basics
Related work
Discounted pseudocosts
Computational results
Future work and Conclusion
