Parallel and (Nearly) Work-Efficient Dynamic Programming

Xiangyun Ding, Yan Gu, Yihan Sun

TL;DR

This work introduces the Cordon Algorithm, a framework for (nearly) work-efficient parallel dynamic programming: it identifies a frontier of DP states that are ready to be finalized and processes them in parallel while preserving correctness. Applied to problems such as LIS, LCS, convex/concave GLWS, OAT, and GAP, the framework yields near-optimal work (within a polylogarithmic factor of the best sequential algorithms) with nontrivial parallelism and, in several cases, polylogarithmic span. The results carry optimized sequential DP techniques (e.g., decision monotonicity) over to the parallel setting, with both theoretical bounds and practical proofs-of-concept. Experiments on large-scale instances show speedups over optimized sequential implementations.

Abstract

The idea of dynamic programming (DP), proposed by Bellman in the 1950s, is one of the most important algorithmic techniques. However, in the parallel setting, many fundamental and sequentially simple problems become more challenging, and whether they admit a (nearly) work-efficient solution (i.e., one whose work is off by at most a polylogarithmic factor from the best sequential solution) remains open. In fact, sequential DP algorithms employ many advanced optimizations such as decision monotonicity or special data structures, and achieve better work than straightforward solutions. Many such optimizations are inherently sequential, which creates extra challenges for a parallel algorithm to achieve the same work bound. The goal of this paper is to achieve (nearly) work-efficient parallel DP algorithms by parallelizing classic, highly-optimized and practical sequential algorithms. We show a general framework called the Cordon Algorithm for parallel DP algorithms, and use it to solve several classic problems. Our selection of problems includes Longest Increasing Subsequence (LIS), sparse Longest Common Subsequence (LCS), convex/concave generalized Least Weight Subsequence (LWS), Optimal Alphabetic Tree (OAT), and more. We show how the Cordon Algorithm can be used to achieve the same level of optimization as the sequential algorithms, and achieve good parallelism. Many of our algorithms are conceptually simple, and we show some experimental results as proofs-of-concept.
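To make the "frontier of ready states" idea concrete, here is a minimal round-based sketch for LIS. It is an illustration under stated assumptions, not the paper's algorithm: the function name `lis_by_rounds` is hypothetical, the inner loop is a sequential stand-in for what would be a parallel prefix-minimum, and the total work is O(n·k) for LIS length k, so it is not work-efficient. It only shows the round structure: in each round, an unfinished element is ready when no unfinished element before it has a strictly smaller value, and all ready elements can be finalized together.

```python
from typing import List


def lis_by_rounds(a: List[int]) -> List[int]:
    """Return dp, where dp[i] is the length of the longest strictly
    increasing subsequence of `a` that ends at position i.

    Illustrative sketch only (hypothetical helper, not from the paper).
    """
    n = len(a)
    dp = [0] * n
    remaining = list(range(n))  # unfinished indices, kept in input order
    rnd = 0
    while remaining:
        rnd += 1
        next_remaining = []
        prefix_min = None  # smallest value among unfinished elements seen so far
        # Sequential stand-in for a parallel prefix-min over `remaining`.
        for i in remaining:
            if prefix_min is None or a[i] <= prefix_min:
                # No unfinished earlier element is strictly smaller,
                # so state i is ready and is finalized in this round.
                dp[i] = rnd
                prefix_min = a[i]
            else:
                # Some unfinished earlier element is smaller: i must wait.
                next_remaining.append(i)
        remaining = next_remaining
    return dp


if __name__ == "__main__":
    values = [3, 1, 4, 1, 5, 9, 2, 6]
    dp = lis_by_rounds(values)
    print(dp, "LIS length:", max(dp))
```

On the input `[3, 1, 4, 1, 5, 9, 2, 6]` the sketch finishes in four rounds, one per distinct DP value, which equals the LIS length.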

Paper Structure

This paper contains 28 sections, 18 theorems, 13 equations, 7 figures, 2 algorithms.

Key Result

Theorem 2.1

The Cordon Algorithm is correct.

Figures (7)

  • Figure 1: Convex and concave decision monotonicity. (a) Convexity: for two states $i<j$, their best decisions satisfy $i^*\le j^*$. (b) Concavity: for two states $i<j$, their best decisions satisfy either $j^*\le i^*$ or $j^*\ge i$.
  • Figure 2: Illustrations for the LIS/LCS problem and the Cordon Algorithm. Subfigure (a): An input sequence for LIS with the DP value of each element. Subfigure (b): A geometric view of the input sequence on a 2D plane with each element represented as $(i, A_i)$. Subfigure (c): The corresponding LCS on this input: the answer is the longest path from $(0,0)$ to $(8,8)$ using the maximum number of red edges. Subfigure (d): The process to compute the second cordon. The ready states are marked in the shaded gray region. The cordon is decided by the three cells with LIS=2 in the original input. Subfigure (e): A general LCS case where every diagonal can be a red edge. Subfigure (f): An example execution of our LCS algorithm. Here we only show unready states for simplicity.
  • Figure 3: Example of applying the Cordon Algorithm to the post office problem with a convex cost function. Circles (states) are villages. Arrows are best decisions between states. The final answer is four post offices serving villages 1-3, 4-7, 8-9, and 10-12, respectively. The subrounds below illustrate the prefix-doubling scheme in FindCordon.
  • Figure 4: Example of a cordon in the GAP problem.
  • Figure 5: Cordon Algorithm on a tree. Note that the sibling nodes must have the same status.
  • ...and 2 more figures
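
The convex decision monotonicity described for Figure 1(a) can be observed on a small instance. The sketch below is an assumption-laden illustration, not code from the paper: it uses the standard LWS recurrence D[i] = min over j < i of D[j] + w(j, i), a hypothetical cost w(j, i) = (i - j)^2 + 10 (which satisfies the Monge inequality, a usual sufficient condition for convex decision monotonicity), and a brute-force O(n^2) scan with leftmost tie-breaking. The helper names `lws_best_decisions` and `example_cost` are made up for this example.

```python
from typing import Callable, List, Optional, Tuple


def lws_best_decisions(
    n: int, w: Callable[[int, int], int]
) -> Tuple[List[int], List[Optional[int]]]:
    """Brute-force LWS: D[0] = 0 and D[i] = min_{j < i} D[j] + w(j, i).

    Returns (D, best), where best[i] is the leftmost j attaining D[i];
    best[0] is None because state 0 has no decision.
    """
    D: List[int] = [0] * (n + 1)
    best: List[Optional[int]] = [None] * (n + 1)
    for i in range(1, n + 1):
        # min over (value, j) pairs breaks ties toward the smallest j.
        D[i], best[i] = min((D[j] + w(j, i), j) for j in range(i))
    return D, best


def example_cost(j: int, i: int) -> int:
    """Hypothetical Monge cost: squared segment length plus a fixed fee."""
    return (i - j) ** 2 + 10


if __name__ == "__main__":
    D, best = lws_best_decisions(12, example_cost)
    decisions = best[1:]
    # Convex decision monotonicity: i < j implies best[i] <= best[j].
    is_monotone = all(x <= y for x, y in zip(decisions, decisions[1:]))
    print("best decisions:", decisions)
    print("monotone:", is_monotone)
```

Optimized sequential algorithms exploit this monotone structure to avoid scanning all j for every i; the brute-force scan here is only meant to expose the property, not to be efficient.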

Theorems & Definitions (18)

  • Theorem 2.1
  • Theorem 3.1
  • Theorem 3.2
  • Lemma 4.1
  • Lemma 4.2
  • Theorem 4.1
  • Theorem 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Lemma 4.5
  • ...and 8 more