Rethinking the Capacity of Graph Neural Networks for Branching Strategy
Ziang Chen, Jialin Liu, Xiaohan Chen, Xinshang Wang, Wotao Yin
TL;DR
This work analyzes the capacity of graph neural networks (GNNs) to imitate the strong branching (SB) heuristic in MILP solvers. It introduces the class of MP-tractable MILPs and proves that message-passing GNNs (MP-GNNs) can universally approximate SB scores within this class, while showing that no such guarantee holds for MILPs outside it. To overcome this limitation, the authors show that second-order folklore GNNs (2-FGNNs), which model higher-order interactions, can universally approximate SB scores across all MILPs, albeit at higher computational cost. The results provide a theoretical foundation for using GNNs to mimic SB, offer practical guidance on when to employ MP-GNNs versus 2-FGNNs, and are supported by numerical experiments on both MP-tractable and non-MP-tractable instances.
Abstract
Graph neural networks (GNNs) have been widely used to predict properties and heuristics of mixed-integer linear programs (MILPs) and hence accelerate MILP solvers. This paper investigates the capacity of GNNs to represent strong branching (SB), the most effective yet computationally expensive heuristic employed in the branch-and-bound algorithm. In the literature, the message-passing GNN (MP-GNN), the simplest GNN structure, is frequently used as a fast approximation of SB. We find, however, that MP-GNNs cannot represent the SB scores of all MILPs. We precisely define a class of "MP-tractable" MILPs for which MP-GNNs can accurately approximate SB scores. In particular, we establish a universal approximation theorem: for any data distribution over the MP-tractable class, there always exists an MP-GNN that approximates the SB score with arbitrarily high accuracy and arbitrarily high probability. This lays a theoretical foundation for existing work on imitating SB with MP-GNNs. For MILPs that are not MP-tractable, unfortunately, a similar result is impossible: we exhibit two MILP instances with different SB scores that cannot be distinguished by any MP-GNN, regardless of the number of parameters. Recognizing this, we explore another GNN structure, the second-order folklore GNN (2-FGNN), that overcomes this limitation. Using 2-FGNNs, the aforementioned universal approximation theorem extends to the entire MILP space, regardless of MP-tractability. A small-scale numerical experiment is conducted to directly validate our theoretical findings.
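To make the indistinguishability claim concrete, the following is a minimal sketch of color refinement (the 1-WL test) on the bipartite variable-constraint graph of an MILP. MP-GNNs are known to be no more discriminative than color refinement, so two graphs that receive identical color histograms receive identical MP-GNN outputs. Everything specific here is an assumption made for illustration: the function names, the node features, and the two toy instances (one ring of six constraints versus two rings of three) are hypothetical and are not the pair of instances constructed in the paper, and the sketch makes no claim about their SB scores.

```python
from collections import Counter


def refine_colors(graphs, rounds=6):
    """Run color refinement (1-WL) jointly on several bipartite MILP graphs.

    Each graph is a triple (var_feats, con_feats, edges), where edges maps
    (constraint index, variable index) -> coefficient A_ij.
    Returns one Counter of final colors per graph; color ids are shared
    across graphs, so the Counters are directly comparable.
    """
    # Initial colors: raw node features, tagged by node type.
    colorings = []
    for var_feats, con_feats, _ in graphs:
        col = {("v", j): ("v", f) for j, f in enumerate(var_feats)}
        col.update({("c", i): ("c", f) for i, f in enumerate(con_feats)})
        colorings.append(col)

    for _ in range(rounds):
        palette = {}  # signature -> compact color id, shared by all graphs
        updated = []
        for (_, _, edges), col in zip(graphs, colorings):
            # Collect (edge weight, neighbor color) pairs for every node.
            nbrs = {node: [] for node in col}
            for (i, j), a in edges.items():
                nbrs[("c", i)].append((a, col[("v", j)]))
                nbrs[("v", j)].append((a, col[("c", i)]))
            new = {}
            for node, c in col.items():
                signature = (c, tuple(sorted(nbrs[node])))
                new[node] = palette.setdefault(signature, len(palette))
            updated.append(new)
        colorings = updated

    return [Counter(col.values()) for col in colorings]


def ring_edges(n, offset=0):
    """Constraint i touches variables i and i+1 (cyclically), coefficient 1."""
    return {(offset + i, offset + (i + k) % n): 1.0
            for i in range(n) for k in (0, 1)}


# Hypothetical features: (objective coefficient, is_integer) per variable,
# (right-hand side, sense) per constraint. All nodes of a type are identical.
var_feats = [(1.0, True)] * 6
con_feats = [(1.0, "==")] * 6

one_ring = (var_feats, con_feats, ring_edges(6))
two_rings = (var_feats, con_feats, {**ring_edges(3), **ring_edges(3, offset=3)})

hist_one, hist_two = refine_colors([one_ring, two_rings])
print(hist_one == hist_two)
```

Under these assumptions every node has the same features and the same degree as every other node of its type, so refinement never separates them and the two histograms coincide even though one graph is connected and the other is not. This is the kind of structural blind spot that higher-order architectures such as 2-FGNNs are meant to address.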
