Scalable Approximate Optimal Diagonal Preconditioning

Wenzhi Gao; Zhaonan Qu; Madeleine Udell; Yinyu Ye

TL;DR

This work considers the problem of finding the optimal diagonal preconditioner for a positive definite matrix and proposes practical algorithms, based on the idea of dimension reduction, for finding approximate optimal diagonal preconditioners of large sparse systems.

Abstract

We consider the problem of finding the optimal diagonal preconditioner for a positive definite matrix. Although this problem has been shown to be solvable and various methods have been proposed, none of the existing approaches scales to matrices of large dimension or applies when access is limited to black-box matrix-vector products, which significantly limits their practical use. In view of these challenges, we propose practical algorithms for finding approximate optimal diagonal preconditioners of large sparse systems. Our approach is based on the idea of dimension reduction and combines techniques from semi-definite programming (SDP), random projection, semi-infinite programming (SIP), and column generation. Numerical experiments demonstrate that our method scales to sparse matrices of size greater than $10^7$. Notably, our approach is efficient and implementable using only black-box matrix-vector product operations, making it highly practical for a wide variety of applications.
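To make the phrase "black-box matrix-vector products" concrete, here is a minimal sketch of what matvec-only access allows. It is not the method of the paper: it swaps in a much simpler, well-known technique (a Hutchinson-style randomized diagonal estimator) that yields a Jacobi-type diagonal scaling. The names `estimate_diagonal` and `matvec`, and the tridiagonal test matrix, are hypothetical and chosen only for illustration.

```python
import numpy as np

def estimate_diagonal(matvec, n, num_probes=200, seed=0):
    """Hutchinson-style estimate of diag(M) using only products x -> M @ x.

    This is a stand-in technique for illustration, not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    acc = np.zeros(n)
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=n)  # Rademacher probe, so v * v == 1
        acc += v * matvec(v)                 # E[v * (M v)] = diag(M)
    return acc / num_probes

# Hypothetical test problem: a symmetric, strictly diagonally dominant
# tridiagonal matrix (hence positive definite), never formed explicitly.
n = 500
main = np.linspace(4.0, 1000.0, n)  # diagonal entries

def matvec(x):
    y = main * x
    y[:-1] -= x[1:]   # superdiagonal entries equal to -1
    y[1:] -= x[:-1]   # subdiagonal entries equal to -1
    return y

d_est = estimate_diagonal(matvec, n)
rel_err = np.max(np.abs(d_est - main) / main)
```

The estimate `d_est` could then serve as a diagonal scaling for an iterative solver. The point of the sketch is only that useful diagonal information is recoverable without ever reading the entries of the matrix.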

Paper Structure

This paper contains 43 sections, 14 theorems, 48 equations, 11 figures, 2 tables, 2 algorithms.

Introduction
Contributions
Structure of the paper
Preliminaries
Notation.
Optimal diagonal preconditioner in a subspace
Optimal diagonal preconditioner in a deterministic subspace
Optimal diagonal preconditioner in a randomized subspace
A semi-infinite approach to the optimal preconditioning SDP
Preconditioner optimization via column (matrix) generation
Dual interpretation of optimal diagonal preconditioning
Column generation for the optimal preconditioning SDP
Semi-infinite duality and dual solution retrieval
Dimension reduction for column generation
Implications of dimension reduction and extensions
...and 28 more sections

Key Result

Theorem 2.1

Given a positive definite matrix $\mathbf{M} \in \mathbb{S}_{++}^n$, the optimal diagonal preconditioner $\mathbf{D}^\star = \operatorname{diagm}(\mathbf{d}^\star)$, which solves the optimal preconditioning problem (eqn:formulation), can be obtained by solving a semi-definite optimization problem in the variables $\mathbf{d} \in \mathbb{R}^n_{++}$ and $\tau \in \mathbb{R}$. The optimal value $\tau^\star = \kappa^{-1}$ equals the inverse condition number …
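The SDP itself is not reproduced in this excerpt. One formulation consistent with $\tau^\star = \kappa^{-1}$ is to maximize $\tau$ subject to $\tau \mathbf{M} \preceq \mathbf{D} \preceq \mathbf{M}$; this constraint form is an assumption here, not a quotation from the paper. Under that assumption, any positive $\mathbf{d}$ can be rescaled into a feasible point whose $\tau$ is the inverse condition number of $\mathbf{D}^{-1/2} \mathbf{M} \mathbf{D}^{-1/2}$, which the following sketch checks numerically. The helper names and the test matrix are hypothetical.

```python
import numpy as np

def scaled_spectrum(M, d):
    """Sorted eigenvalues of D^{-1/2} M D^{-1/2} for D = diag(d), d > 0."""
    s = 1.0 / np.sqrt(d)
    return np.linalg.eigvalsh(M * np.outer(s, s))

def feasible_pair(M, d):
    """Rescale d so that D <= M in the Loewner order, and return (d_scaled, tau)
    with tau * M <= D_scaled. Assumes constraints of the form tau*M <= D <= M.
    """
    eig = scaled_spectrum(M, d)
    d_scaled = eig[0] * d   # largest multiple of D that M still dominates
    tau = eig[0] / eig[-1]  # inverse condition number of the scaled matrix
    return d_scaled, tau

# Hypothetical badly scaled positive definite matrix M = S A S.
rng = np.random.default_rng(0)
n = 30
B = rng.standard_normal((n, n))
scales = np.logspace(0, 3, n)
M = (B @ B.T + n * np.eye(n)) * np.outer(scales, scales)

d_jacobi, tau_jacobi = feasible_pair(M, np.diag(M).copy())  # Jacobi scaling
_, tau_identity = feasible_pair(M, np.ones(n))              # no preconditioning
```

A larger feasible `tau` means a smaller condition number after scaling, so the SDP can be read as a search over `d` for the largest such `tau`. Here the Jacobi scaling is only one feasible point, not the optimum.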

Figures (11)

Figure 1: Left: an illustration of the dual problem. Given $\mathbf{X}_2$ on the hyperplane $\langle \mathbf{M}, \mathbf{X}_2 \rangle = 1$, we seek $\mathbf{X}_1$ such that (1) $\mathbf{X}_1 - \mathbf{X}_2$ is orthogonal to each $\mathbf{D}_i$ and (2) $\mathbf{X}_1$ minimizes $\langle \mathbf{M}, \mathbf{X}_1 \rangle$. Right: the optimal solution.
Figure 2: Iterating in the space of preconditioners. In each iteration, we obtain the best linear combination between the best preconditioner so far $\mathbf{d}_1$ and an improving direction $\mathbf{d}_2$. After solving the SIP, we update the best preconditioner to $\mathbf{d}_1 \leftarrow \mathbf{d}^{\text{best}}$ and get a new improving direction $\mathbf{d}_2 \leftarrow \hat{\mathbf{d}}$ from the pricing problem.
Figure 3: Condition number versus randomized subspace dimension $k$. First two columns: uniform distribution; last two columns: normal distribution. x-axis: subspace dimension $k$; y-axis: condition number.
Figure 4: Convergence behavior of the SIP approach to solving the subspace problem (eqn:odp-subspace). From left to right: matrices of $(n, \sigma) \in \{(10^2, 10^{-1}), (10^4, 10^{-4}), (10^6, 10^{-6}), (10^7, 5\times10^{-8})\}$. x-axis: number of cutting plane iterations; y-axis: infeasibility of the two SDP blocks, measured by their minimum eigenvalues.
Figure 5: Distribution of condition number for different preconditioners.
...and 6 more figures
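The update described in the caption of Figure 2 (combine the best preconditioner so far, $\mathbf{d}_1$, with an improving direction $\mathbf{d}_2$) can be illustrated with a brute-force stand-in. The sketch below does not solve the SIP or the pricing problem from the paper. It only grid-searches convex combinations of two positive vectors and keeps the one with the smallest condition number. The function names, the restriction to convex combinations, and the test matrix are all assumptions made for illustration.

```python
import numpy as np

def cond_after_scaling(M, d):
    """Condition number of D^{-1/2} M D^{-1/2} for D = diag(d), d > 0."""
    s = 1.0 / np.sqrt(d)
    eig = np.linalg.eigvalsh(M * np.outer(s, s))
    return eig[-1] / eig[0]

def best_combination(M, d1, d2, num_grid=101):
    """Grid search over d = (1 - t) * d1 + t * d2 with t in [0, 1].

    A brute-force illustration only; assumes d1 and d2 are entrywise positive
    so that every combination is a valid diagonal preconditioner.
    """
    best_t, best_d, best_kappa = 0.0, d1, cond_after_scaling(M, d1)
    for t in np.linspace(0.0, 1.0, num_grid):
        d = (1.0 - t) * d1 + t * d2
        kappa = cond_after_scaling(M, d)
        if kappa < best_kappa:
            best_t, best_d, best_kappa = t, d, kappa
    return best_t, best_d, best_kappa

# Hypothetical badly scaled positive definite matrix.
rng = np.random.default_rng(1)
n = 40
B = rng.standard_normal((n, n))
scales = np.logspace(0, 2, n)
M = (B @ B.T + n * np.eye(n)) * np.outer(scales, scales)

d1 = np.ones(n)          # current preconditioner: identity scaling
d2 = np.diag(M).copy()   # candidate direction: Jacobi scaling
t_best, d_best, kappa_best = best_combination(M, d1, d2)
```

Because the grid contains both endpoints, the combined preconditioner is never worse than either input. Repeating the step with `d1 = d_best` and a fresh `d2` mirrors the iteration in the caption, although the paper obtains `d2` from a pricing problem rather than from a fixed choice.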

Theorems & Definitions (20)

Theorem 2.1: SDP formulation of optimal diagonal preconditioning [qu2022optimal]
Theorem 3.1
Corollary 1
Remark 1
Lemma 3.1: Approximate optimality: medium $k$
Lemma 3.2: Approximate optimality, large $k$
Theorem 3.2
Remark 2
Remark 3
Lemma 4.1
...and 10 more
