Efficient parallel implementation of the multiplicative weight update method for graph-based linear programs
Caleb Ju, Serif Yesil, Mengyuan Sun, Chandra Chekuri, Edgar Solomonik
TL;DR
This work develops a practical, parallel solver for positive linear programs based on the multiplicative weight update (MWU) method and demonstrates its application to key graph problems such as densest subgraph, bipartite matching, vertex cover, and dominating set. Central to the approach are a step-size search heuristic, an implicit representation of graph incidence constraints, and a 2D distributed-memory layout, which together enable high performance on large-scale graphs. Empirical results on the Stampede2 supercomputer show that the MWU-based solver is faster than general-purpose LP solvers (IBM CPLEX, Gurobi) in all reported experiments and, in some instances, outperforms state-of-the-art specialized parallel graph algorithms. The combination of algorithmic innovations and software optimizations provides a viable, generalizable alternative for approximate positive-LP-based graph optimization at scale.
Abstract
Positive linear programs (LPs) model many graph and operations research problems. One can compute a $(1+ε)$-approximate solution to a positive LP, for any chosen $ε > 0$, in polylogarithmic depth and near-linear work via variations of the multiplicative weight update (MWU) method. Despite extensive theoretical work on these algorithms over the decades, their empirical performance is not well understood. In this work, we implement and test an efficient parallel algorithm for solving positive LP relaxations, and apply it to graph problems such as densest subgraph, bipartite matching, vertex cover, and dominating set. We accelerate the algorithm via a new step-size search heuristic. Our implementation uses sparse linear algebra optimization techniques such as fusion of vector operations and sparse matrix formats. Furthermore, we devise an implicit representation for graph incidence constraints. We demonstrate parallel scalability using OpenMP threading and MPI on the Stampede2 supercomputer. We compare this implementation with exact libraries and specialized libraries for the above problems to evaluate MWU's practical standing among other methods in both accuracy and performance. Our results show that this implementation is faster than general-purpose LP solvers (IBM CPLEX, Gurobi) in all of our experiments and, in some instances, outperforms state-of-the-art specialized parallel graph algorithms.
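To illustrate what an implicit representation of graph incidence constraints could look like, the following minimal sketch applies an edge-vertex incidence matrix and its transpose directly from an edge list, without storing the matrix. It is an assumption-laden illustration rather than the implementation described in this work: the function names `incidence_apply` and `incidence_apply_transpose`, the pure-Python edge-list layout, and the small path graph are all hypothetical.

```python
# Illustrative sketch only (not the authors' code): the edge-vertex incidence
# matrix A has one row per edge and a 1 in the two columns of that edge's
# endpoints. Both products below are computed from the edge list alone.

def incidence_apply(edges, x):
    """Return A @ x: entry e is x[u] + x[v] for edge e = (u, v)."""
    return [x[u] + x[v] for (u, v) in edges]


def incidence_apply_transpose(edges, num_vertices, y):
    """Return A.T @ y: entry v is the sum of y over edges incident to v."""
    out = [0.0] * num_vertices
    for (u, v), y_e in zip(edges, y):
        out[u] += y_e
        out[v] += y_e
    return out


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
edges = [(0, 1), (1, 2), (2, 3)]

# Vertex cover LP (covering): each edge needs x[u] + x[v] >= 1.
x = [0.5, 0.5, 0.5, 0.5]
edge_coverage = incidence_apply(edges, x)

# Matching LP (packing): each vertex needs its incident y values to sum to <= 1.
y = [1.0, 0.0, 1.0]
vertex_load = incidence_apply_transpose(edges, 4, y)

print(edge_coverage)  # [1.0, 1.0, 1.0]
print(vertex_load)    # [1.0, 1.0, 1.0, 1.0]
```

Each product touches every edge once, so the cost is linear in the number of edges and the only storage beyond the input and output vectors is the edge list itself. MWU-style iterations are dominated by products of this kind, which is why avoiding an explicit matrix can matter at scale.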
