Exploiting Parallelism in a QPALM-based Solver for Optimal Control

Pieter Pas; Kristoffer Fink Løwenstein; Daniele Bernardini; Panagiotis Patrinos

Abstract

We discuss the opportunities for parallelization in the recently proposed QPALM-OCP algorithm, a solver tailored to quadratic programs arising in optimal control. A significant part of the computational work can be carried out independently for the different stages in the optimal control problem. We exploit this specific structure to apply parallelization and vectorization techniques in an optimized C++ implementation of the method. Results for optimal control benchmark problems and comparisons to the original QPALM method are provided.

Paper Structure (15 sections, 10 equations, 3 figures, 1 table)

Introduction
Problem formulation
QPALM-OCP
The augmented Lagrangian inner problem
Solving the semismooth Newton system
Vectorization
Compact storage format
Vectorized linear algebra routines for matrices in compact storage
Parallelization
Numerical examples
Setup
Spring-mass benchmark
Effectiveness of parallelization
MPC qpbenchmark
Conclusion

Figures (3)

Figure 1: Comparison between "naive" (left) and "compact" storage (right) of matrices $A_j$ for a problem of size $N = 4 = n_x$ and a vector length of $d=2$. In the naive format, each matrix $A_j$ is stored contiguously in memory (e.g., in column-major order), and matrix $A_{j+1}$ of the next stage is stored after $A_j$. In the compact format, matrices $A_{2k}$ and $A_{2k+1}$ are interleaved in such a way that their elements $(A_{2k})_{\imath\jmath}$ and $(A_{2k+1})_{\imath\jmath}$ are stored next to each other. Alternatively, the naive format can be seen as an $n_x \times n_x \times N$ tensor where the stage number $j < N$ is the index with the largest stride. In contrast, the compact format would be an $n_x \times n_x \times \lceil N/d \rceil$ tensor where each element is a tuple of $d$ elements.
Figure 2: Average solver run times for different numbers of masses, horizon $N=15$. Shaded areas indicate best and worst run times.
Figure 3: Effect of parallelization and vectorization for different horizon lengths, with $M=30$ masses.
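
The two layouts described in the caption of Figure 1 can be illustrated with a short index-mapping sketch. This is only a minimal illustration under stated assumptions, not the implementation from the paper: the names `Dims`, `naive_index`, `compact_index`, `compact_size` and `naive_to_compact` are hypothetical, the matrices are assumed to be square ($n_x \times n_x$) and stored in column-major order, and the last block is assumed to be zero-padded when $N$ is not a multiple of $d$.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper types and names, introduced only for illustration.
struct Dims {
    std::size_t n; // number of rows and columns of each A_j (n_x)
    std::size_t N; // number of stages
    std::size_t d; // vector length (number of interleaved stages)
};

// Naive layout: each A_j is contiguous (column major), and the stage
// index has the largest stride.
inline std::size_t naive_index(const Dims &s, std::size_t row,
                               std::size_t col, std::size_t stage) {
    return row + s.n * col + s.n * s.n * stage;
}

// Compact layout: d consecutive stages form one block, and the same
// element (row, col) of those d matrices is stored contiguously.
inline std::size_t compact_index(const Dims &s, std::size_t row,
                                 std::size_t col, std::size_t stage) {
    const std::size_t block = stage / s.d; // which group of d stages
    const std::size_t lane  = stage % s.d; // position inside the d-tuple
    return lane + s.d * (row + s.n * col + s.n * s.n * block);
}

// Number of scalars in compact storage: ceil(N / d) blocks of d lanes.
inline std::size_t compact_size(const Dims &s) {
    const std::size_t blocks = (s.N + s.d - 1) / s.d;
    return s.n * s.n * blocks * s.d;
}

// Copy all stages from the naive layout into the compact layout.
// Unused lanes of the last block (assumed padding) remain zero.
inline std::vector<double> naive_to_compact(const Dims &s,
                                            const std::vector<double> &naive) {
    assert(naive.size() == s.n * s.n * s.N);
    std::vector<double> compact(compact_size(s), 0.0);
    for (std::size_t stage = 0; stage < s.N; ++stage)
        for (std::size_t col = 0; col < s.n; ++col)
            for (std::size_t row = 0; row < s.n; ++row)
                compact[compact_index(s, row, col, stage)] =
                    naive[naive_index(s, row, col, stage)];
    return compact;
}
```

With $n_x = 4$, $N = 4$ and $d = 2$ as in Figure 1, the elements $(A_0)_{00}$ and $(A_1)_{00}$ land at offsets 0 and 1 of the compact array, so a single vector load of width $d$ fetches the same entry of $d$ consecutive stages.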
