Parallel KKT Solver in PIQP for Multistage Optimization

Fenglong Song, Roland Schwan, Yuwen Chen, Colin N. Jones

TL;DR

This work tackles the computational bottleneck of solving KKT systems in multistage optimal control problems (OCPs) by introducing a parallel KKT solver that operates directly on the linear-algebra structure. It extends a permutation-based, two-phase parallel Cholesky factorization to block-tridiagonal-arrow KKT matrices and couples it with a parallel forward–backward substitution strategy, all implemented as a new PIQP backend parallelized with OpenMP. The approach preserves numerical robustness and reduces runtime on the chain-of-masses and minimum-curvature race line benchmarks, achieving overall speedups of up to about $3.6\times$ over the sequential multistage solver and gains of up to $2.7\times$ over competing solvers on long horizons. The method also lays a path toward GPU acceleration, where $\mathcal{O}(\log N)$ scaling becomes possible once sufficient parallel resources are available, potentially broadening real-time applicability for complex multistage OCPs.

Abstract

This paper presents an efficient parallel Cholesky factorization and triangular solve algorithm for the Karush-Kuhn-Tucker (KKT) systems arising in multistage optimization problems, with a focus on model predictive control and trajectory optimization for racing. The proposed approach parallelizes, at the linear-algebra level, the solution of the block-tridiagonal-arrow KKT systems that arise in interior-point methods. The algorithm is implemented as a new backend of the PIQP solver and released as open source. Numerical experiments on the chain-of-masses benchmarks and a minimum-curvature race line optimization problem demonstrate substantial performance gains compared to other state-of-the-art solvers.
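
For readers unfamiliar with the term, a block-tridiagonal-arrow matrix is a symmetric block-tridiagonal matrix bordered by one dense final block row and column. The display below is only an illustrative sketch of that shape: the block names $A_i$, $B_i$, $C_i$, $D$ and the use of $N$ diagonal blocks are assumptions made here for illustration, not the paper's notation, and $\Psi$ is taken to denote the unpermuted KKT matrix by analogy with the $\hat{\Psi}$ of Figure 2.

$$
\Psi =
\begin{bmatrix}
A_1 & B_1^\top &        &              & C_1^\top \\
B_1 & A_2      & \ddots &              & C_2^\top \\
    & \ddots   & \ddots & B_{N-1}^\top & \vdots   \\
    &          & B_{N-1} & A_N         & C_N^\top \\
C_1 & C_2      & \cdots & C_N          & D
\end{bmatrix}
$$

In this picture the tridiagonal part couples only neighboring blocks, while the arrow blocks $C_i$ and $D$ couple every block to a shared set of variables.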

Paper Structure

This paper contains 17 sections, 28 equations, 4 figures, 3 tables, 3 algorithms.

Figures (4)

  • Figure 1: The KKT matrix before and after permutation.
  • Figure 2: The Cholesky factorization of $\hat{\Psi}$ such that $\hat{\Psi} = \hat{L}\hat{L}^\top$, and the permuted right-hand side $\hat{r}\coloneqq P r$.
  • Figure 3: Theoretical speedups of our parallel method compared to the sequential method. The gray region marks where the condition $N \geq 2p$ is not satisfied; this condition is required for the parallel method across $p$ threads to be meaningful.
  • Figure 4: Benchmark results for the chain-of-masses OCP with varying horizons $N$ and numbers of threads $p$.
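
The condition $N \geq 2p$ from the Figure 3 caption can be made concrete with a small sketch. The helper below, `split_horizon`, is hypothetical and not part of PIQP. It assumes the horizon is divided into $p$ contiguous segments of near-equal length, which is an assumption made here for illustration and not necessarily the partitioning the paper uses. Under that assumption, $N \geq 2p$ guarantees that every thread receives at least two stages.

```python
from typing import List


def split_horizon(N: int, p: int) -> List[range]:
    """Split stages 0..N-1 into p contiguous, near-equal segments.

    Hypothetical helper for illustration only (not PIQP's API).
    Rejects inputs that violate N >= 2p, the condition quoted in
    the Figure 3 caption.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if N < 2 * p:
        raise ValueError("need N >= 2p for the parallel split to be meaningful")

    base, extra = divmod(N, p)  # the first `extra` segments get one more stage
    segments = []
    start = 0
    for t in range(p):
        length = base + (1 if t < extra else 0)
        segments.append(range(start, start + length))
        start += length
    return segments


if __name__ == "__main__":
    for seg in split_horizon(10, 3):
        print(list(seg))
```

Calling `split_horizon(5, 3)` raises a `ValueError`, which corresponds to a point inside the gray region of Figure 3.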