GFORS: GPU-Accelerated First-Order Method with Randomized Sampling for Binary Integer Programs

Ningji Wei, Jiaming Liang

TL;DR

GFORS introduces a GPU-accelerated framework for large binary integer programs by coupling a PDHG-style first-order method on a continuous relaxation with a randomized, feasibility-aware sampling module. The method operates end-to-end on GPUs, yielding near-stationarity guarantees for the first-order component and probabilistic bounds on sampled solutions, without global optimality certificates. It enhances sampling with techniques such as total-unimodular reformulation, customized sampling, and monotone relaxation, and demonstrates competitive performance on large-scale instances where traditional solvers struggle within tight time limits. Overall, GFORS serves as a scalable, GPU-native complement to exact solvers, delivering fast, high-quality incumbents when problem size and response time are the primary constraints.

Abstract

We present GFORS, a GPU-accelerated framework for large binary integer programs. It couples a first-order (PDHG-style) routine that guides the search in the continuous relaxation with a randomized, feasibility-aware sampling module that generates batched binary candidates. Both components are designed to run end-to-end on GPUs with minimal CPU-GPU synchronization. The framework establishes near-stationary-point guarantees for the first-order routine and probabilistic bounds on the feasibility and quality of sampled solutions, while not providing global optimality certificates. To improve sampling effectiveness, we introduce techniques such as total-unimodular reformulation, customized sampling design, and monotone relaxation. On classic benchmarks (set cover, knapsack, max cut, 3D assignment, facility location), baseline state-of-the-art exact solvers remain stronger on small-medium instances, while GFORS attains high-quality incumbents within seconds; on large instances, GFORS yields substantially shorter runtimes, with solution quality often comparable to -- or better than -- the baseline under the same time limit. These results suggest that GFORS can complement exact solvers by delivering scalable, GPU-native search when problem size and response time are the primary constraints.
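
The two-stage idea in the abstract (a first-order routine on the continuous relaxation, followed by batched randomized sampling of binary candidates) can be illustrated with a minimal CPU sketch. This is not the authors' implementation: it assumes a covering-type problem $\min c^\top x$ s.t. $Ax \ge b$, $x \in \{0,1\}^n$, uses textbook PDHG on the LP relaxation, and samples with independent Bernoulli draws. The function names `pdhg_relaxation` and `sample_binary` and the toy instance are hypothetical.

```python
import math
import random


def pdhg_relaxation(c, A, b, iters=500):
    """Textbook PDHG on: min c.x  s.t.  A x >= b,  0 <= x <= 1 (assumed problem form)."""
    m, n = len(A), len(c)
    # The Frobenius norm upper-bounds the spectral norm, so tau * sigma * ||A||^2 < 1.
    fro = math.sqrt(sum(a * a for row in A for a in row)) or 1.0
    tau = sigma = 0.9 / fro
    x, y = [0.5] * n, [0.0] * m
    for _ in range(iters):
        # Primal step: the Lagrangian gradient in x is c - A^T y; project onto [0, 1]^n.
        grad = [c[j] - sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
        x_new = [min(1.0, max(0.0, x[j] - tau * grad[j])) for j in range(n)]
        # Extrapolated primal point used by the dual update.
        x_bar = [2.0 * x_new[j] - x[j] for j in range(n)]
        # Dual step: ascend on the constraint residual b - A x_bar; project onto y >= 0.
        y = [max(0.0, y[i] + sigma * (b[i] - sum(A[i][j] * x_bar[j] for j in range(n))))
             for i in range(m)]
        x = x_new
    return x


def sample_binary(probs, c, A, b, batch, rng):
    """Draw `batch` Bernoulli(probs) candidates; return the best feasible one, if any."""
    best_x, best_val = None, float("inf")
    for _ in range(batch):
        z = [1 if rng.random() < p else 0 for p in probs]
        feasible = all(sum(a * zj for a, zj in zip(row, z)) >= bi
                       for row, bi in zip(A, b))
        if feasible:
            val = sum(cj * zj for cj, zj in zip(c, z))
            if val < best_val:
                best_x, best_val = z, val
    return best_x, best_val


if __name__ == "__main__":
    # Hypothetical toy set-cover instance: 2 elements, 3 candidate sets.
    A = [[1, 1, 0], [0, 1, 1]]
    b = [1, 1]
    c = [1.0, 1.0, 1.0]
    x_frac = pdhg_relaxation(c, A, b)
    print(sample_binary(x_frac, c, A, b, batch=64, rng=random.Random(0)))
```

In a GPU-native setting both loops would be expressed as batched matrix operations in an array library. The nested Python loops here only keep the sketch dependency-free.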

Paper Structure

This paper contains 32 sections, 8 theorems, 68 equations, 1 figure, 7 tables, 4 algorithms.

Key Result

Proposition 1

Assume ${\cal \hat{L}}(\cdot,y_k)$ is convex, $\tau_1 \tau_2 \|K\|^2 \le 1/4$, and $\tau_1 \le 1/(4L)$. Define Then, the following statements hold:
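
The two step-size conditions in the assumption, $\tau_1 \tau_2 \|K\|^2 \le 1/4$ and $\tau_1 \le 1/(4L)$, can be satisfied constructively. The sketch below is only an illustration under assumptions not confirmed by this excerpt: $K$ is taken to be a dense matrix, $L$ a positive smoothness constant, and $\|K\|$ is replaced by the Frobenius norm, which upper-bounds the spectral norm and therefore gives a conservative (sufficient) check. The helper names are hypothetical.

```python
def frobenius_norm_sq(K):
    """Squared Frobenius norm; an upper bound on the squared spectral norm of K."""
    return sum(a * a for row in K for a in row)


def choose_step_sizes(K, L):
    """Pick (tau1, tau2) with tau1 = 1/(4L) and tau1 * tau2 * ||K||_F^2 = 1/4."""
    if L <= 0:
        raise ValueError("L must be positive in this sketch")
    norm_sq = frobenius_norm_sq(K)
    if norm_sq == 0:
        raise ValueError("K must be nonzero in this sketch")
    tau1 = 1.0 / (4.0 * L)
    tau2 = 1.0 / (4.0 * tau1 * norm_sq)
    return tau1, tau2


def conditions_hold(K, L, tau1, tau2, tol=1e-12):
    """Sufficient check of both conditions using the Frobenius bound on ||K||."""
    coupling_ok = tau1 * tau2 * frobenius_norm_sq(K) <= 0.25 + tol
    smooth_ok = tau1 <= 1.0 / (4.0 * L) + tol
    return coupling_ok and smooth_ok


if __name__ == "__main__":
    K = [[3.0, 0.0], [0.0, 4.0]]  # hypothetical coupling matrix
    tau1, tau2 = choose_step_sizes(K, L=2.0)
    print(tau1, tau2, conditions_hold(K, 2.0, tau1, tau2))
```

Using the Frobenius norm avoids an eigenvalue computation at the cost of smaller steps. A tighter spectral-norm estimate would allow a larger $\tau_2$.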

Figures (1)

  • Figure 1: Performance vs. NNZ. Solved percentages and time-to-target values are aggregated by $\lfloor \log_{2}(\mathrm{nnz}) \rfloor$. Because Gurobi may spend a long time refining incumbents even after finding strong solutions early, we adopt time-to-target as a fair measure: for GFORS, it is the time at which its best solution is found before halting; for Gurobi (with/without presolve), it is the first time an incumbent improves upon the best GFORS solution, or $1{,}800$ seconds if none is found.
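
The time-to-target measure and the $\lfloor \log_{2}(\mathrm{nnz}) \rfloor$ bucketing described in the caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' evaluation script: it assumes minimization, represents a solver run as a hypothetical list of `(time_seconds, objective)` incumbents in chronological order, and averages times within each bucket.

```python
from collections import defaultdict


def time_to_target(trace, target, limit=1800.0):
    """First time an incumbent is strictly better than `target`, else `limit`.

    `trace` is a chronological list of (time_seconds, objective) pairs;
    minimization is assumed, so "better" means a smaller objective.
    """
    for t, obj in trace:
        if obj < target:
            return t
    return limit


def log2_bucket(nnz):
    """floor(log2(nnz)) for a positive integer, computed exactly via bit_length."""
    if nnz < 1:
        raise ValueError("nnz must be a positive integer")
    return nnz.bit_length() - 1


def mean_time_by_bucket(records):
    """Average time-to-target per bucket; `records` holds (nnz, time_seconds) pairs."""
    groups = defaultdict(list)
    for nnz, t in records:
        groups[log2_bucket(nnz)].append(t)
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}


if __name__ == "__main__":
    # Hypothetical incumbent trace and instance sizes, for illustration only.
    trace = [(1.0, 10.0), (5.0, 7.0), (9.0, 6.0)]
    print(time_to_target(trace, target=7.0))
    print(mean_time_by_bucket([(1000, 2.0), (1023, 4.0), (1024, 6.0)]))
```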

Theorems & Definitions (17)

  • Proposition 1
  • proof
  • Theorem 1
  • proof
  • Proposition 2
  • proof
  • Theorem 2
  • proof
  • Proposition 3
  • proof
  • ...and 7 more