Parallel and (Nearly) Work-Efficient Dynamic Programming
Xiangyun Ding, Yan Gu, Yihan Sun
TL;DR
This work addresses the challenge of achieving nearly work-efficient parallel dynamic programming by introducing the Cordon Algorithm, a framework that identifies a frontier of ready DP states and processes them in parallel while preserving correctness. By applying this framework to problems such as LIS, LCS, convex/concave generalized LWS (GLWS), OAT, and GAP, the authors obtain nearly work-efficient algorithms with nontrivial parallelism and, in several cases, polylogarithmic span. The results carry optimized sequential DP techniques (e.g., decision monotonicity) over to the parallel setting, delivering both theoretical bounds and practical proofs-of-concept. Experimental results on large-scale instances demonstrate substantial speedups over optimized sequential implementations, underscoring the framework’s practical impact for real-world DP problems.
Abstract
The idea of dynamic programming (DP), proposed by Bellman in the 1950s, is one of the most important algorithmic techniques. However, in the parallel setting, many fundamental and sequentially simple problems become more challenging, and it remains open whether they admit a (nearly) work-efficient solution (i.e., one whose work is off by at most a polylogarithmic factor from the best sequential solution). In fact, sequential DP algorithms employ many advanced optimizations, such as decision monotonicity or special data structures, and achieve better work than straightforward solutions. Many such optimizations are inherently sequential, which creates extra challenges for a parallel algorithm to achieve the same work bound. The goal of this paper is to achieve (nearly) work-efficient parallel DP algorithms by parallelizing classic, highly optimized, and practical sequential algorithms. We show a general framework called the Cordon Algorithm for parallel DP algorithms, and use it to solve several classic problems. Our selection of problems includes Longest Increasing Subsequence (LIS), sparse Longest Common Subsequence (LCS), convex/concave generalized Least Weight Subsequence (LWS), Optimal Alphabetic Tree (OAT), and more. We show how the Cordon Algorithm can be used to achieve the same level of optimization as the sequential algorithms, and achieve good parallelism. Many of our algorithms are conceptually simple, and we show some experimental results as proofs-of-concept.
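To make the frontier idea concrete, the following is a minimal sketch for LIS, not the authors' implementation. It assumes strictly increasing subsequences and uses the standard observation that, among the elements whose DP value is still unknown, those that are prefix minima are exactly the ones ready to be finalized in the current round. The function name `lis_by_rounds` and its return shape are illustrative choices. The rounds are simulated sequentially here; within a round the frontier states are independent of one another and could be processed in parallel.

```python
def lis_by_rounds(a):
    """Round-based LIS sketch.

    Returns (dp, rounds), where dp[i] is the length of the longest strictly
    increasing subsequence ending at a[i], and rounds is the number of
    frontier rounds (equal to the LIS length of a).
    """
    n = len(a)
    dp = [0] * n
    remaining = list(range(n))  # indices whose dp value is not yet finalized
    rounds = 0
    while remaining:
        rounds += 1
        frontier, rest = [], []
        prefix_min = None
        # An unfinished element is ready iff no earlier unfinished element
        # is strictly smaller, i.e. it is a prefix minimum of `remaining`.
        for i in remaining:
            if prefix_min is None or a[i] <= prefix_min:
                frontier.append(i)
                prefix_min = a[i]
            else:
                rest.append(i)
        # Frontier states do not depend on each other, so this loop is the
        # part that a parallel implementation would run concurrently.
        for i in frontier:
            dp[i] = rounds
        remaining = rest
    return dp, rounds


if __name__ == "__main__":
    dp, rounds = lis_by_rounds([3, 1, 4, 1, 5, 9, 2, 6])
    print(dp, rounds)
```

This simulation rescans all unfinished elements in every round, so its work is O(nk) for LIS length k and it is not work-efficient. Matching the sequential O(n log n) bound requires a faster way to locate each round's frontier, which this sketch does not attempt.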
