Exploiting block triangular submatrices in KKT systems

Robert Parker; Manuel Garcia; Russell Bent

Abstract

We propose a method for solving Karush-Kuhn-Tucker (KKT) systems that exploits block triangular submatrices: a Schur complement decomposition first isolates the block triangular submatrices, and a block backsolve then requires factorizing only the diagonal blocks of the block triangular form. We show that factorizing reducible symmetric-indefinite matrices with standard 1$\times$1 or 2$\times$2 pivots yields fill-in outside the diagonal blocks of the block triangular form, in contrast to our proposed method. Although exploiting a block triangular submatrix limits fill-in, unsymmetric matrix factorization methods do not reveal inertia, which interior point methods for nonconvex optimization require. We show that our target matrix has inertia that is known \textit{a priori}, letting us compute the inertia of the KKT matrix by Sylvester's law of inertia. Finally, we demonstrate the computational advantage of this method on KKT systems from optimization problems with neural network surrogates in their constraints. Our method achieves up to 15$\times$ speedups over the state-of-the-art symmetric indefinite matrix factorization methods MA57 and MA86 in a constant-hardware comparison.
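The two steps named in the abstract can be illustrated with a minimal dense NumPy sketch. It is not the authors' implementation. It assumes the pivot matrix has the saddle-point form $P = \begin{bmatrix} H & T^\top \\ T & 0 \end{bmatrix}$ with $T$ square, nonsingular, and block lower triangular; this form, the function names, and the random test data are all assumptions made for illustration. Under that assumption, solves with $P$ need only factorizations of the diagonal blocks of $T$, and the inertia of $P$ is $(m, m, 0)$ for an $m \times m$ block $T$, so the inertia of the full matrix follows by adding the inertia of the Schur complement.

```python
import numpy as np

def block_lower_solve(T, sizes, b):
    """Solve T x = b for block lower triangular T; only diagonal blocks are factorized."""
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    x = np.zeros_like(b, dtype=float)
    for i in range(len(sizes)):
        lo, hi = offsets[i], offsets[i + 1]
        # Move already-computed blocks to the right-hand side, then solve one diagonal block.
        rhs = b[lo:hi] - T[lo:hi, :lo] @ x[:lo]
        x[lo:hi] = np.linalg.solve(T[lo:hi, lo:hi], rhs)
    return x

def block_upper_solve(U, sizes, b):
    """Solve U x = b for block upper triangular U (for example U = T.T)."""
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    x = np.zeros_like(b, dtype=float)
    for i in reversed(range(len(sizes))):
        lo, hi = offsets[i], offsets[i + 1]
        rhs = b[lo:hi] - U[lo:hi, hi:] @ x[hi:]
        x[lo:hi] = np.linalg.solve(U[lo:hi, lo:hi], rhs)
    return x

def pivot_solve(H, T, sizes, rhs):
    """Solve [[H, T^T], [T, 0]] [a; b] = [f; g] using only block triangular solves."""
    m = T.shape[0]
    f, g = rhs[:m], rhs[m:]
    a = block_lower_solve(T, sizes, g)            # second block row: T a = g
    b = block_upper_solve(T.T, sizes, f - H @ a)  # first block row: T^T b = f - H a
    return np.concatenate([a, b])

def schur_solve(A, B, H, T, sizes, r_x, r_p):
    """Solve [[A, B^T], [B, P]] [u; w] = [r_x; r_p] via the Schur complement of P."""
    S = A - B.T @ pivot_solve(H, T, sizes, B)     # S = A - B^T P^{-1} B
    u = np.linalg.solve(S, r_x - B.T @ pivot_solve(H, T, sizes, r_p))
    w = pivot_solve(H, T, sizes, r_p - B @ u)
    return u, w, S

def inertia(M, tol=1e-9):
    """Counts of (positive, negative, zero) eigenvalues of a symmetric matrix."""
    ev = np.linalg.eigvalsh(M)
    return (int((ev > tol).sum()), int((ev < -tol).sum()), int((np.abs(ev) <= tol).sum()))

# Hypothetical test data: three diagonal blocks, made diagonally dominant so they are nonsingular.
rng = np.random.default_rng(0)
sizes = [2, 3, 2]
m, n_x = sum(sizes), 3
offsets = np.concatenate(([0], np.cumsum(sizes)))
T = rng.uniform(-1.0, 1.0, (m, m))
for lo, hi in zip(offsets[:-1], offsets[1:]):
    T[lo:hi, hi:] = 0.0                           # zero out everything right of the diagonal block
    T[lo:hi, lo:hi] += 4.0 * np.eye(hi - lo)
H = rng.uniform(-1.0, 1.0, (m, m)); H = H + H.T
A = rng.uniform(-1.0, 1.0, (n_x, n_x)); A = A + A.T
B = rng.uniform(-1.0, 1.0, (2 * m, n_x))
P = np.block([[H, T.T], [T, np.zeros((m, m))]])
K = np.block([[A, B.T], [B, P]])
rhs = rng.uniform(-1.0, 1.0, n_x + 2 * m)

u, w, S = schur_solve(A, B, H, T, sizes, rhs[:n_x], rhs[n_x:])
sol = np.concatenate([u, w])
```

In this sketch the small Schur complement `S` is the only matrix that needs a symmetric factorization revealing inertia; `inertia(K)` can then be compared against the entrywise sum of `inertia(P)` and `inertia(S)`.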

Paper Structure (23 sections, 5 theorems, 33 equations, 5 figures, 7 tables, 2 algorithms)

Introduction
Background
Interior point methods
Inertia
Schur complement
Block triangular decomposition
Graphs of sparse matrices
Neural networks
Exploiting block triangularity within the Schur complement
Our approach
Application to neural network submatrices
Fill-in
Inertia and regularization
Test problems
Adversarial image generation (MNIST)
...and 8 more sections

Key Result

Theorem 3.1

Let $C$ be the matrix on the right-hand side of Equation \ref{blocktriangular:eqn:c-partition}. A $1\times 1$ pivot on $d_{11}$ may result in fill-in everywhere except the last block row and column.
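The mechanism by which a $1\times 1$ pivot creates fill-in can be sketched on a small symmetric sparsity pattern. The pattern below is hypothetical and is not the matrix $C$ of the theorem; the helper `pivot_fill` is likewise an illustrative name. Eliminating a diagonal pivot couples every pair of rows that have a nonzero in the pivot column, so new nonzeros appear wherever such a pair was previously uncoupled, while rows with no entry in the pivot column are left untouched.

```python
import numpy as np

def pivot_fill(pattern, k=0):
    """Structural fill-in created by a 1x1 pivot on entry (k, k) of a symmetric pattern."""
    pattern = np.asarray(pattern, dtype=bool)
    col = pattern[:, k].copy()
    col[k] = False                      # rows coupled to the pivot, excluding the pivot itself
    update = np.outer(col, col)         # positions touched by the rank-one Schur update
    return update & ~pattern            # touched positions that were structurally zero

# Hypothetical 4x4 symmetric pattern: rows 1 and 2 touch the pivot, row 3 does not.
pattern = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=bool)

fill = pivot_fill(pattern, k=0)
fill_positions = [(int(i), int(j)) for i, j in zip(*np.nonzero(fill))]
```

Here `fill_positions` contains the symmetric pair `(1, 2)` and `(2, 1)`, and row and column 3 receive no fill.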

Figures (5)

Figure 1: Incidence matrix structure of neural network constraints. Rows are equations and columns are output or intermediate variables. Each box along the diagonal contains the variables (and constraints) for a particular layer of the network, i.e., $y_l$, $z_l$, and the associated constraints from \ref{blocktriangular:eqn:nn-full}. The dashed boxes under the diagonal contain the nonzeros corresponding to the links between these variables and the variables of the next layer. That is, they contain the $W_l$ matrices from \ref{blocktriangular:eqn:nn-full}.
Figure 1: Permutation of $C$ to block triangular form
Figure 1: Structure of KKT, pivot, and Schur complement matrices for the smallest instance of the MNIST test problem. The KKT matrix is in the order shown in \ref{blocktriangular:eqn:kkt-xy}.
Figure 2: Structure of KKT, pivot, and Schur complement matrices for the smallest instance of the SCOPF test problem.
Figure 3: Structure of KKT, pivot, and Schur complement matrices for the smallest instance of the LSV test problem.

Theorems & Definitions (9)

Theorem 3.1
Proof 1: Proof of \ref{blocktriangular:thm:1by1}
Theorem 3.2
Proof 2: Proof of \ref{blocktriangular:thm:2by2}
Theorem 3.3
Proof 3: Proof of \ref{blocktriangular:thm:inertia}
Theorem 3.4
Theorem 3.5
Proof 4: Proof of \ref{blocktriangular:thm:regularization}
