Exploiting block triangular submatrices in KKT systems

Robert Parker, Manuel Garcia, Russell Bent

Abstract

We propose a method for solving Karush-Kuhn-Tucker (KKT) systems that exploits block triangular submatrices. The method first uses a Schur complement decomposition to isolate the block triangular submatrices, then performs a block backsolve in which only the diagonal blocks of the block triangular form need to be factorized. We show that factorizing reducible symmetric-indefinite matrices with standard 1$\times$1 or 2$\times$2 pivots yields fill-in outside the diagonal blocks of the block triangular form, which our proposed method avoids. Although exploiting a block triangular submatrix limits fill-in, unsymmetric matrix factorization methods do not reveal inertia, which interior point methods for nonconvex optimization require. We show that our target matrix has inertia that is known \textit{a priori}, letting us compute the inertia of the KKT matrix by Sylvester's law. Finally, we demonstrate the computational advantage of this method on KKT systems from optimization problems with neural network surrogates in their constraints. Our method achieves speedups of up to 15$\times$ over the state-of-the-art symmetric indefinite factorization methods MA57 and MA86 in a constant-hardware comparison.
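The steps named in the abstract can be illustrated with a small dense sketch. Everything here is an assumption made for illustration, not the authors' implementation: the partition `K = [[A, B^T], [B, P]]`, the pivot matrix `P = [[H, C^T], [C, 0]]` with `C` square, nonsingular, and block lower triangular, and all function names are hypothetical. Under those assumptions, `P` has inertia `(m, m, 0)`, and because `K` is congruent to `diag(S, P)` with `S = A - B^T P^{-1} B`, Sylvester's law gives the inertia of `K` from the inertia of `S` alone. Dense `np.linalg.solve` and `eigvalsh` stand in for the sparse factorizations a practical solver would use.

```python
import numpy as np

def block_lower_solve(C, blocks, rhs):
    """Solve C u = rhs for block lower triangular C.

    `blocks` holds the (start, stop) range of each diagonal block;
    only those diagonal blocks are ever factorized.
    """
    u = np.zeros_like(rhs, dtype=float)
    for start, stop in blocks:
        # Move already-computed unknowns to the right-hand side.
        r = rhs[start:stop] - C[start:stop, :start] @ u[:start]
        u[start:stop] = np.linalg.solve(C[start:stop, start:stop], r)
    return u

def block_lower_transpose_solve(C, blocks, rhs):
    """Solve C^T v = rhs by visiting the diagonal blocks in reverse."""
    v = np.zeros_like(rhs, dtype=float)
    for start, stop in reversed(blocks):
        r = rhs[start:stop] - C[stop:, start:stop].T @ v[stop:]
        v[start:stop] = np.linalg.solve(C[start:stop, start:stop].T, r)
    return v

def solve_pivot(H, C, blocks, r):
    """Solve [[H, C^T], [C, 0]] [u; v] = r with two block backsolves."""
    m = C.shape[0]
    u = block_lower_solve(C, blocks, r[m:])                       # C u = r2
    v = block_lower_transpose_solve(C, blocks, r[:m] - H @ u)     # C^T v = r1 - H u
    return np.concatenate([u, v])

def inertia(M, tol=1e-10):
    """(positive, negative, zero) eigenvalue counts of symmetric M."""
    eig = np.linalg.eigvalsh(M)
    return (int((eig > tol).sum()), int((eig < -tol).sum()),
            int((np.abs(eig) <= tol).sum()))

# Hypothetical test problem: m = 6 with diagonal blocks of size 2, 3, 1.
rng = np.random.default_rng(0)
blocks = [(0, 2), (2, 5), (5, 6)]
m, n = 6, 3
C = rng.standard_normal((m, m)) + 4.0 * np.eye(m)
for start, stop in blocks:
    C[start:stop, stop:] = 0.0          # zero everything right of each diagonal block
H = rng.standard_normal((m, m)); H = H + H.T
A = rng.standard_normal((n, n)); A = A + A.T
B = rng.standard_normal((2 * m, n))
P = np.block([[H, C.T], [C, np.zeros((m, m))]])
K = np.block([[A, B.T], [B, P]])
rhs = rng.standard_normal(n + 2 * m)
a, b = rhs[:n], rhs[n:]

# Schur complement solve: only S and the diagonal blocks of C are factorized.
S = A - B.T @ solve_pivot(H, C, blocks, B)
w = np.linalg.solve(S, a - B.T @ solve_pivot(H, C, blocks, b))
z = solve_pivot(H, C, blocks, b - B @ w)
solution = np.concatenate([w, z])

# Inertia of K from the known inertia (m, m, 0) of P plus that of S.
pos_S, neg_S, zero_S = inertia(S)
inertia_K = (m + pos_S, m + neg_S, zero_S)
```

In this sketch the full matrix `K` is assembled only so the result can be checked against it; the solve itself touches `S` and the diagonal blocks of `C`.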

Paper Structure

This paper contains 23 sections, 5 theorems, 33 equations, 5 figures, 7 tables, 2 algorithms.

Key Result

Theorem 3.1

Let $C$ be the matrix on the right-hand side of \ref{blocktriangular:eqn:c-partition}. A $1\times 1$ pivot on $d_{11}$ may result in fill-in everywhere except the last block row and column.
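The mechanism behind this kind of fill-in can be seen on a small example. The sketch below is a generic illustration, assuming a hypothetical 4$\times$4 symmetric matrix `M` rather than the partitioned matrix the theorem refers to. One symmetric elimination step on the pivot `M[0, 0]` replaces the trailing block with `M[1:, 1:] - c c^T / d`, so a new nonzero appears wherever two nonzeros of the pivot column `c` meet at a position that was previously zero, while rows with a zero in the pivot column receive no fill-in.

```python
import numpy as np

def eliminate_first_pivot(M):
    """Trailing matrix after one symmetric 1x1 pivot on M[0, 0]."""
    d = M[0, 0]
    c = M[1:, 0]
    return M[1:, 1:] - np.outer(c, c) / d

# Hypothetical pattern: the pivot column has nonzeros in rows 1 and 3 only.
M = np.array([
    [2.0, 1.0, 0.0, 1.0],
    [1.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0],
    [1.0, 0.0, 0.0, 5.0],
])

trailing = eliminate_first_pivot(M)
# Positions that were zero in M but are nonzero after the pivot.
fill_in = (trailing != 0) & (M[1:, 1:] == 0)
```

Here `fill_in` marks the two symmetric positions that couple original rows 1 and 3, and the row with no entry in the pivot column stays untouched.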

Figures (5)

  • Figure 1: Incidence matrix structure of neural network constraints. Rows are equations and columns are output or intermediate variables. Each box along the diagonal contains the variables (and constraints) for a particular layer of the network, i.e., $y_l$, $z_l$, and the associated constraints from \ref{blocktriangular:eqn:nn-full}. The dashed boxes under the diagonal contain the nonzeros corresponding to the links between these variables and the variables of the next layer. That is, they contain the $W_l$ matrices from \ref{blocktriangular:eqn:nn-full}.
  • Figure 1: Permutation of $C$ to block triangular form
  • Figure 1: Structure of KKT, pivot, and Schur complement matrices for the smallest instance of the MNIST test problem. The KKT matrix is in the order shown in \ref{blocktriangular:eqn:kkt-xy}.
  • Figure 2: Structure of KKT, pivot, and Schur complement matrices for the smallest instance of the SCOPF test problem.
  • Figure 3: Structure of KKT, pivot, and Schur complement matrices for the smallest instance of the LSV test problem.

Theorems & Definitions (9)

  • Theorem 3.1
  • Proof 1: Proof of \ref{blocktriangular:thm:1by1}
  • Theorem 3.2
  • Proof 2: Proof of \ref{blocktriangular:thm:2by2}
  • Theorem 3.3
  • Proof 3: Proof of \ref{blocktriangular:thm:inertia}
  • Theorem 3.4
  • Theorem 3.5
  • Proof 4: Proof of \ref{blocktriangular:thm:regularization}