Efficient parallel implementation of the multiplicative weight update method for graph-based linear programs

Caleb Ju; Serif Yesil; Mengyuan Sun; Chandra Chekuri; Edgar Solomonik

TL;DR

This work develops a practical, parallel solver for positive linear programs based on the multiplicative weight update (MWU) method and demonstrates its application to key graph problems such as densest subgraph, bipartite matching, vertex cover, and dominating set. Central to the approach are a step-size line search strategy, implicit matrix representations, and a 2D distributed-memory layout that enable high performance on large-scale graphs. Empirical results on the Stampede2 supercomputer show MWU-based solvers can outperform general LP solvers (CPLEX, Gurobi) and, in some cases, rival specialized parallel graph algorithms, achieving substantial speedups and strong scalability. The combination of algorithmic innovations and software optimizations provides a viable, generalizable alternative for approximate positive-LP-based graph optimization at scale.

Abstract

Positive linear programs (LPs) model many graph and operations research problems. One can compute a $(1+\epsilon)$-approximation for positive LPs, for any selected $\epsilon$, in polylogarithmic depth and near-linear work via variations of the multiplicative weight update (MWU) method. Despite extensive theoretical work on these algorithms through the decades, their empirical performance is not well understood. In this work, we implement and test an efficient parallel algorithm for solving positive LP relaxations, and apply it to graph problems such as densest subgraph, bipartite matching, vertex cover, and dominating set. We accelerate the algorithm via a new step size search heuristic. Our implementation uses sparse linear algebra optimization techniques such as fusion of vector operations and sparse matrix formats. Furthermore, we devise an implicit representation for graph incidence constraints. We demonstrate parallel scalability using OpenMP threading and MPI on the Stampede2 supercomputer. We compare this implementation with exact libraries and specialized libraries for the above problems in order to evaluate MWU's practical standing in both accuracy and performance among other methods. Our results show this implementation is faster than general purpose LP solvers (IBM CPLEX, Gurobi) in all of our experiments, and in some instances, outperforms state-of-the-art specialized parallel graph algorithms.
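To make the idea of an implicit representation for graph incidence constraints concrete, here is a minimal sketch in plain Python. It assumes an edge-vertex incidence matrix $A$ with one row per edge and entries of 1 at that edge's two endpoints, and applies $A$ and $A^\top$ straight from an edge list without ever storing the matrix. The function names `incidence_apply` and `incidence_apply_transpose` are hypothetical and are not taken from the paper's implementation, which may organize this differently.

```python
def incidence_apply(edges, x):
    """Return y = A x, where row e of A has ones at the endpoints of edge e.

    So y[e] = x[u] + x[v] for edge e = (u, v); A itself is never stored.
    """
    return [x[u] + x[v] for u, v in edges]


def incidence_apply_transpose(edges, y, num_vertices):
    """Return z = A^T y, where z[v] sums y[e] over edges e incident to v."""
    z = [0.0] * num_vertices
    for (u, v), y_e in zip(edges, y):
        z[u] += y_e
        z[v] += y_e
    return z


# Small example: a triangle 0-1-2 plus a pendant edge 2-3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]

# Fractional vertex cover x = 1/2 everywhere covers every edge exactly once.
print(incidence_apply(edges, [0.5, 0.5, 0.5, 0.5]))      # [1.0, 1.0, 1.0, 1.0]

# Applying A^T to the all-ones vector yields the vertex degrees.
print(incidence_apply_transpose(edges, [1.0] * 4, 4))    # [2.0, 2.0, 3.0, 1.0]
```

Working from the edge list keeps memory at one pair of indices per edge and avoids storing explicit matrix values, which is the kind of saving an implicit representation is meant to provide.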

Paper Structure (31 sections, 2 theorems, 27 equations, 5 figures, 4 tables, 3 algorithms)

Introduction
Background on Linear Program Solvers
Positive LP Solvers
The MWU Algorithm
Graph Problems as Positive LPs
Heuristics to Improve Convergence in Practice MWU with Line Search
Line Search as a Constrained Optimization Problem
Implementing Line Search
Software Optimizations and Parallelization
Shared-Memory Optimizations
Choice of Matrix Format
Implicit Representations
Loop Fusion and Vectorization Opportunities
Distributed Parallelization
Experimental Setup
...and 16 more sections

Key Result

Theorem 4.1

MWU with line search (Algorithm \ref{alg:exp2}) either returns a $(1+\epsilon)$-relative approximate solution, i.e., an $x \geq \mathbb{0}$ such that $\bm{Px} \leq (1+\epsilon) \mathbb{1}$ and $\bm{Cx} \geq \mathbb{1}$, or correctly reports that the LP is infeasible. The number of iterations is at most …
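The acceptance condition in the theorem can be checked directly. The sketch below tests whether a candidate $x$ is a $(1+\epsilon)$-relative approximate solution, assuming the packing matrix $P$ and covering matrix $C$ are given as small dense lists of rows. The helper names `matvec` and `is_relative_approx` are hypothetical, and a real implementation would use sparse or implicit operators rather than dense rows.

```python
def matvec(rows, x):
    """Dense matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in rows]


def is_relative_approx(P, C, x, eps):
    """Check x >= 0, P x <= (1 + eps) * 1, and C x >= 1 componentwise."""
    if any(xi < 0 for xi in x):
        return False
    packing_ok = all(v <= 1 + eps for v in matvec(P, x))
    covering_ok = all(v >= 1 for v in matvec(C, x))
    return packing_ok and covering_ok


# Toy instance: one packing row (x0 + x1) and two covering rows (2*x0, 2*x1).
P = [[1.0, 1.0]]
C = [[2.0, 0.0], [0.0, 2.0]]

print(is_relative_approx(P, C, [0.5, 0.5], eps=0.1))  # True
print(is_relative_approx(P, C, [0.6, 0.6], eps=0.1))  # False: packing row is 1.2
print(is_relative_approx(P, C, [0.5, 0.4], eps=0.1))  # False: second covering row is 0.8
```

Note the asymmetry in the definition: only the packing constraints are relaxed by the factor $(1+\epsilon)$, while the covering constraints must hold exactly.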

Figures (5)

Figure 1: Four graph problems run on the same graph. LP variables with a nonzero value are highlighted in red. An example of matching is given in Figure \ref{fig:matching_vis}. The edges in the matching are marked with thick red lines.
Figure 2: CSB representation and its SpMV operation.
Figure 3: Max violation, defined as $\max\{0,\max(\bm{Px})-1,1-\min(\bm{Cx})\}$, for MPCSolver, which is a gradient descent algorithm with adaptive error \cite{makari2013distributed}, and MWU (Algorithm \ref{alg:exp2}) with standard step size and Newton's method.
Figure 4: Scalability of MWU-opt and MWU-PETSc (see subfigure (j) for the legend). Axes are in $\log_2$ scale. All values are normalized to single-threaded execution of MWU-opt. We omit the results for MWU-PETSc when its execution is slower than single-threaded execution of MWU-opt.
Figure 5: Breakdown of execution times and speedups obtained with MWU-opt for different components.
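
The max violation metric from the Figure 3 caption, $\max\{0,\max(\bm{Px})-1,1-\min(\bm{Cx})\}$, is simple to compute once the products $\bm{Px}$ and $\bm{Cx}$ are available. The following is an illustrative sketch only; the function name `max_violation` is hypothetical, and the inputs are assumed to be plain lists holding the already computed products.

```python
def max_violation(Px, Cx):
    """Return max{0, max(Px) - 1, 1 - min(Cx)}.

    Px and Cx are the precomputed packing and covering products.
    A value of 0 means every constraint Px <= 1 and Cx >= 1 holds.
    """
    return max(0.0, max(Px) - 1.0, 1.0 - min(Cx))


# Packing overshoots by 0.25 and covering falls short by 0.5.
print(max_violation([1.25, 0.5], [0.5, 2.0]))  # 0.5

# All constraints satisfied, so the violation is clamped to zero.
print(max_violation([0.75], [1.5]))            # 0.0
```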

Theorems & Definitions (2)

Theorem 4.1
Proposition 4.2
