Rethinking the Capacity of Graph Neural Networks for Branching Strategy

Ziang Chen, Jialin Liu, Xiaohan Chen, Xinshang Wang, Wotao Yin

TL;DR

This work analyzes the capacity of graph neural networks (GNNs) to imitate the strong branching (SB) heuristic in MILP solvers. It introduces the class of MP-tractable MILPs and proves that message-passing GNNs (MP-GNNs) can universally approximate SB scores within this class, while showing that no such guarantee holds for MILPs outside it. To overcome this limitation, the authors show that second-order folklore GNNs (2-FGNNs), which use higher-order interactions, can universally approximate SB scores across all MILPs, albeit at higher computational cost. The results provide a theoretical foundation for using GNNs to mimic SB, offer practical guidance on when to employ MP-GNNs versus 2-FGNNs, and are supported by numerical experiments on both MP-tractable and non-MP-tractable instances.

Abstract

Graph neural networks (GNNs) have been widely used to predict properties and heuristics of mixed-integer linear programs (MILPs) and hence accelerate MILP solvers. This paper investigates the capacity of GNNs to represent strong branching (SB), the most effective yet computationally expensive heuristic employed in the branch-and-bound algorithm. In the literature, the message-passing GNN (MP-GNN), as the simplest GNN structure, is frequently used as a fast approximation of SB, and we find that not all MILPs' SB scores can be represented with an MP-GNN. We precisely define a class of "MP-tractable" MILPs for which MP-GNNs can accurately approximate SB scores. In particular, we establish a universal approximation theorem: for any data distribution over the MP-tractable class, there always exists an MP-GNN that can approximate the SB score with arbitrarily high accuracy and arbitrarily high probability, which lays a theoretical foundation for existing work on imitating SB with MP-GNNs. For MILPs without MP-tractability, unfortunately, a similar result is impossible, as illustrated by two MILP instances with different SB scores that cannot be distinguished by any MP-GNN, regardless of the number of parameters. Recognizing this, we explore another GNN structure called the second-order folklore GNN (2-FGNN) that overcomes this limitation, and the aforementioned universal approximation theorem can be extended to the entire MILP space using 2-FGNNs, regardless of MP-tractability. A small-scale numerical experiment is conducted to directly validate our theoretical findings.
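
The indistinguishability claim in the abstract can be illustrated with a minimal sketch. The two constraint-variable graphs below (a single 8-cycle and two disjoint 4-cycles) are a hypothetical pair chosen for illustration, not necessarily the instances used in the paper, and the toy network `mp_gnn_variable_outputs` with random weights is an assumption standing in for a generic MP-GNN. The sketch only shows that message passing yields identical outputs on both graphs; it does not compute SB scores.

```python
import numpy as np

def mp_gnn_variable_outputs(A, rounds=3, width=8, seed=0):
    """Toy message-passing GNN on a bipartite constraint-variable graph.

    A is an (m x n) coefficient matrix: rows are constraint nodes, columns
    are variable nodes, A[i, j] is the edge weight. All node features start
    identical (assumption: equal objective coefficients, bounds, and RHS).
    Returns one scalar output per variable node.
    """
    rng = np.random.default_rng(seed)  # same seed -> same weights on every graph
    m, n = A.shape
    h_con = np.ones((m, width))
    h_var = np.ones((n, width))
    for _ in range(rounds):
        W_self_c, W_msg_c = rng.normal(size=(2, width, width))
        W_self_v, W_msg_v = rng.normal(size=(2, width, width))
        # Each node combines its own state with the weighted sum of neighbors.
        new_con = np.tanh(h_con @ W_self_c + A @ h_var @ W_msg_c)
        new_var = np.tanh(h_var @ W_self_v + A.T @ h_con @ W_msg_v)
        h_con, h_var = new_con, new_var
    w_out = rng.normal(size=width)
    return h_var @ w_out

def cycle8():
    """4 constraints and 4 variables forming one 8-cycle."""
    A = np.zeros((4, 4))
    for i in range(4):
        A[i, i] = 1.0
        A[i, (i + 1) % 4] = 1.0
    return A

def two_cycle4():
    """4 constraints and 4 variables forming two disjoint 4-cycles."""
    A = np.zeros((4, 4))
    A[:2, :2] = 1.0
    A[2:, 2:] = 1.0
    return A

out_a = mp_gnn_variable_outputs(cycle8())
out_b = mp_gnn_variable_outputs(two_cycle4())
print(np.allclose(out_a, out_b))
```

Every node in both graphs has degree 2 and identical initial features, so each round of message passing produces the same embedding at every node, whatever the weights. Changing `seed`, `width`, or `rounds` does not separate the two graphs.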

Paper Structure

This paper contains 43 sections, 18 theorems, 66 equations, 3 figures, 1 table, and 2 algorithms.

Key Result

Theorem 4.1

For any $G\in\mathcal{G}_{m,n}$, the vertex partition induced by the WL test algorithm (if no collision occurs) will converge within $\mathcal{O}(m+n)$ iterations to a partition $(\mathcal{I},\mathcal{J})$, where $\mathcal{I}=\{I_1,I_2,\dots,I_s\}$ is a partition of $\{1,2,\dots,m\}$ and $\mathcal{J} = \{J_1,J_

Figures (3)

  • Figure 1: An illustrative example of MILP and its graph representation.
  • Figure 2: An illustrative example of color refinement and partitions. Initially, all variables share a common color due to their identical node attributes, as do the constraint nodes. After a round of the WL test, $x_1$ and $x_2$ retain their shared color, while $x_3$ is assigned a distinct color, as it connects solely to the first constraint, unlike $x_1$ and $x_2$. Similarly, the colors of the two constraints can also be differentiated. Finally, this partition stabilizes, resulting in $\mathcal{I} = \{\{1\},\{2\}\}$, $\mathcal{J} = \{\{1,2\},\{3\}\}$.
  • Figure 3: Numerical results of MP-GNN and 2-FGNN for SB score fitting. In the right figure, the training error of MP-GNN on MP-intractable examples does not decrease, regardless of the number of training epochs.
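
The color refinement described in the Figure 2 caption can be sketched as follows. The coefficient matrix `A_example` is a hypothetical instance chosen only to match the caption (two constraints, three variables, with $x_3$ appearing solely in the first constraint); the actual MILP in the figure may differ. All node attributes are assumed identical, and exact signatures are compared instead of hashes, so no collisions can occur.

```python
from collections import defaultdict

def relabel(signatures):
    """Map each distinct signature to a small integer color."""
    palette = {sig: k for k, sig in enumerate(sorted(set(signatures)))}
    return [palette[sig] for sig in signatures]

def classes(colors):
    """Group 1-based node indices by color."""
    groups = defaultdict(list)
    for idx, color in enumerate(colors, start=1):
        groups[color].append(idx)
    return sorted(groups.values())

def color_refinement(A):
    """Refine colors on the bipartite graph of a coefficient matrix A.

    A is a list of constraint rows; a nonzero A[i][j] is an edge between
    constraint i and variable j, weighted by that coefficient.
    Returns (constraint partition, variable partition).
    """
    m, n = len(A), len(A[0])
    con, var = [0] * m, [0] * n  # identical initial colors (assumption)
    while True:
        # A node's signature is its own color plus the multiset of
        # (edge weight, neighbor color) pairs.
        con_sig = [(con[i], tuple(sorted((A[i][j], var[j])
                    for j in range(n) if A[i][j] != 0))) for i in range(m)]
        var_sig = [(var[j], tuple(sorted((A[i][j], con[i])
                    for i in range(m) if A[i][j] != 0))) for j in range(n)]
        new_con, new_var = relabel(con_sig), relabel(var_sig)
        # Classes can only split, so unchanged class counts mean stability.
        if (len(set(new_con)) == len(set(con))
                and len(set(new_var)) == len(set(var))):
            return classes(con), classes(var)
        con, var = new_con, new_var

# Hypothetical instance: x1 and x2 appear in both constraints, x3 only in the first.
A_example = [[1, 1, 1],
             [1, 1, 0]]
I_part, J_part = color_refinement(A_example)
print(I_part, J_part)  # [[1], [2]] [[1, 2], [3]]
```

On this instance the partition stabilizes after one round, giving the constraint partition {{1},{2}} and the variable partition {{1,2},{3}} stated in the caption.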

Theorems & Definitions (38)

  • Definition 2.1: Space of MILP-graphs
  • Definition 2.2: Space of MP-GNNs
  • Definition 2.3
  • Definition 3.1: LP relaxation with a single bound change
  • Definition 3.2: Strong branching scores
  • Theorem 4.1: Theorem A.2 of [chen2022representing-lp]
  • Definition 4.2: Message-passing-tractability
  • Theorem 4.4
  • Theorem 4.5
  • Corollary 4.6
  • ...and 28 more