Faster Algorithm for Structured John Ellipsoid Computation

Yang Cao; Xiaoyu Li; Zhao Song; Xin Yang; Tianyi Zhou

TL;DR

The paper addresses the problem of computing the John Ellipsoid of a centrally symmetric polytope $P = \{ x : -\mathbf{1}_n \le Ax \le \mathbf{1}_n \}$, where $A$ is a rank-$d$ matrix. It introduces two fast approximation methods. The first is a sketching-based algorithm that achieves near input-sparsity time, with per-iteration cost $\widetilde{O}(\epsilon^{-1}\mathrm{nnz}(A) + \epsilon^{-2} d^{\omega})$ and $T = O(\epsilon^{-1}\log(n/d))$ total iterations. The second is a treewidth-based algorithm that runs in $O(n\tau^2)$ per iteration when the dual graph of $A$ has treewidth $\tau$. Both approaches preserve the containment guarantee $\frac{1}{\sqrt{1+\epsilon}} Q \subseteq P \subseteq \sqrt{d}\,Q$ for the computed ellipsoid $Q$ and improve upon the previous $\widetilde{O}(n d^2)$ per-iteration cost. The methods rely on leverage score approximation, sampling, and sketching in the input-sparsity setting, and on a $\tau$-sparse Cholesky-based decomposition in the small-treewidth setting, with a telescoping analysis to bound error accumulation. Together, these results yield asymptotically faster algorithms for computing near-optimal John Ellipsoids, with implications for D-optimal design and related convex-optimization tasks in high dimensions.

Abstract

The famous theorem of Fritz John states that any convex body has a unique maximal volume inscribed ellipsoid, known as the John Ellipsoid. Computing the John Ellipsoid is a fundamental problem in convex optimization. In this paper, we focus on approximating the John Ellipsoid inscribed in a convex and centrally symmetric polytope defined by $P := \{ x \in \mathbb{R}^d : -\mathbf{1}_n \leq A x \leq \mathbf{1}_n \}$, where $A \in \mathbb{R}^{n \times d}$ is a rank-$d$ matrix and $\mathbf{1}_n \in \mathbb{R}^n$ is the all-ones vector. We develop two efficient algorithms for approximating the John Ellipsoid. The first is a sketching-based algorithm that runs in nearly input-sparsity time $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega})$, where $\mathrm{nnz}(A)$ denotes the number of nonzero entries in the matrix $A$ and $\omega \approx 2.37$ is the current matrix multiplication exponent. The second is a treewidth-based algorithm that runs in time $\widetilde{O}(n \tau^2)$, where $\tau$ is the treewidth of the dual graph of the matrix $A$. Our algorithms significantly improve upon the state-of-the-art running time of $\widetilde{O}(n d^2)$ achieved by [Cohen, Cousins, Lee, and Yang, COLT 2019].
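
For orientation, the following is a minimal dense sketch of the kind of fixed-point iteration that the $\widetilde{O}(n d^2)$ per-iteration baseline refers to. It is not the paper's sketching-based or treewidth-based algorithm. The update rule $w_i \leftarrow w_i \cdot a_i^\top (A^\top W A)^{-1} a_i$, the uniform start $w_i = d/n$, the averaging of iterates, and the choice $T = \lceil 2\epsilon^{-1}\log(n/d) \rceil$ are assumptions recalled from the prior work cited above; the function names `weighted_scores` and `john_weights` are hypothetical.

```python
import numpy as np


def weighted_scores(A, w):
    """Return sigma_i(w) = a_i^T (A^T diag(w) A)^{-1} a_i for every row a_i of A."""
    M = A.T @ (w[:, None] * A)           # d x d matrix A^T W A, costs O(n d^2)
    AM_inv = A @ np.linalg.inv(M)        # n x d, costs O(n d^2)
    return np.einsum("ij,ij->i", AM_inv, A)


def john_weights(A, eps):
    """Dense fixed-point iteration (assumed baseline); returns the averaged weights."""
    n, d = A.shape
    T = int(np.ceil(2.0 / eps * np.log(n / d)))   # assumed iteration count
    w = np.full(n, d / n)                # uniform start, sums to d
    avg = np.zeros(n)
    for _ in range(T):
        avg += w / T                     # accumulate the running average of iterates
        w = w * weighted_scores(A, w)    # fixed-point update; the sum stays equal to d
    return avg
```

Each iterate sums to $d$ because $\sum_i w_i\, a_i^\top (A^\top W A)^{-1} a_i = \mathrm{tr}(I_d) = d$, so the averaged weights do as well. The two dense products inside `weighted_scores` are where the $n d^2$ cost arises; the paper's methods target exactly this step.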

Paper Structure (45 sections, 30 theorems, 74 equations, 4 figures, 1 table, 1 algorithm)

Introduction
Algorithm in Nearly Input-Sparsity Time
Algorithm for Small Treewidth
Preliminaries
Notations
Leverage Score
Treewidth
Cholesky Factorization
Matrix Concentration
Problem Formulation
Technical Overview
Overview of Previous Work
Algorithm in Nearly Input-Sparsity Time
Fixed Point Iteration
Leverage Score Sampling
...and 30 more sections

Key Result

Theorem 1.1

Given a matrix $A \in \mathbb{R}^{n \times d}$, let a symmetric convex polytope be defined as $P := \{ x \in \mathbb{R}^d : -\mathbf{1}_n \leq A x \leq \mathbf{1}_n \}$. For any $\epsilon, \delta \in (0, 0.1)$, where $\delta$ denotes the failure probability, there exists a randomized algorithm (Algorithm …) … Moreover, it runs within $O(\epsilon^{-1} \log(n/d))$ iterations and each iteration takes $\widetilde{O}(\epsilon^{-1}\mathrm{nnz}(A) + \epsilon^{-2} d^{\omega})$ time.
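
A short derivation may help explain where the containment $\frac{1}{\sqrt{1+\epsilon}} Q \subseteq P \subseteq \sqrt{d}\,Q$ comes from. It rests on assumptions that are not spelled out on this page: that the output is a weight vector $w \in \mathbb{R}^n_{\geq 0}$ with $W = \mathrm{diag}(w)$, $\sum_{i} w_i = d$, and $a_i^\top (A^\top W A)^{-1} a_i \leq 1+\epsilon$ for every row $a_i$ of $A$, and that $Q := \{ x : x^\top A^\top W A x \leq 1 \}$.

$$
x \in P \;\Longrightarrow\; x^\top A^\top W A x = \sum_{i=1}^{n} w_i (a_i^\top x)^2 \leq \sum_{i=1}^{n} w_i = d \;\Longrightarrow\; x \in \sqrt{d}\,Q,
$$

$$
x \in \tfrac{1}{\sqrt{1+\epsilon}}\,Q \;\Longrightarrow\; (a_i^\top x)^2 \leq \big(a_i^\top (A^\top W A)^{-1} a_i\big)\big(x^\top A^\top W A x\big) \leq (1+\epsilon)\cdot\frac{1}{1+\epsilon} = 1 \;\Longrightarrow\; x \in P.
$$

The first line uses $|a_i^\top x| \leq 1$ for $x \in P$; the second is the Cauchy–Schwarz inequality in the inner product induced by $A^\top W A$.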

Figures (4)

Figure 1: Time complexity comparison between [CCLY19] and ours, assuming $n = d^a$, $\epsilon = \Theta(1)$, and ignoring $\log$ factors. The $x$-axis corresponds to $a$ and the $y$-axis corresponds to $b$, where $n^b$ is the total running time.
Figure 2: The geometric interpretation of the output ellipsoid. Let $P$ be a given input polytope. We can find an ellipsoid $Q$ so that $\frac{1}{\sqrt{1+\epsilon}} Q \subseteq P \subseteq \sqrt{d} \cdot Q$ (inner ellipsoid in blue, polytope in orange, outer ellipsoid in red).
Figure 3: (a) A graph $G(V,E)$. (b) A tree decomposition of $G$. The union of the vertices in all bags is $\{a, \dots, i\}$, which equals $V(G)$. For every edge $(u,v) \in E(G)$, at least one bag contains both $u$ and $v$. For example, for edge $(c,b)$, bag $1$ contains both $c$ and $b$; for edge $(c,f)$, bags $2$ and $3$ both contain $c$ and $f$; for edge $(i,g)$, bag $5$ contains $i$ and $g$. Furthermore, the bags containing any one node of (a) form a connected subgraph of the tree in (b). For example, the bags containing node $c$ are bags $1,2,3$, and the bags containing node $f$ are bags $3,5$; each set is a connected subgraph of the tree.
Figure 4: (a) A $10 \times 10$ positive definite matrix $P = AA^{\top}$, where the blue dots represent the nonzero entries of $P$. (b) The Cholesky factor $L$ of $AA^{\top}$. (c) The corresponding elimination tree for matrix $P$, where each node represents one column of the Cholesky factor. Since the row index of the first subdiagonal nonzero entry of the $6$-th column is $8$, the parent of node $6$ is $8$. Furthermore, the nonzero pattern of this column is $\{6,8,10\}$, which is a subset of the vertices on the path from node $6$ to the root in the elimination tree.
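
The parent rule described in the Figure 4 caption (the parent of column $j$ is the row index of its first subdiagonal nonzero in the Cholesky factor) can be illustrated with a small dense sketch. This is only an illustration under stated assumptions: it uses a dense `numpy.linalg.cholesky` rather than a sparse factorization, detects nonzeros with a hypothetical tolerance `tol`, and the helper names `elimination_tree` and `path_to_root` are made up for this example. Indices are 0-based, and `-1` marks the root.

```python
import numpy as np


def elimination_tree(P, tol=1e-12):
    """Return the Cholesky factor L of P and the parent array of its elimination tree."""
    L = np.linalg.cholesky(P)            # lower-triangular factor, P = L L^T
    n = L.shape[0]
    parent = np.full(n, -1)              # -1 means "no parent" (root)
    for j in range(n):
        # Row offsets of the nonzero entries strictly below the diagonal in column j
        below = np.nonzero(np.abs(L[j + 1:, j]) > tol)[0]
        if below.size:
            parent[j] = j + 1 + below[0]  # first subdiagonal nonzero gives the parent
    return L, parent


def path_to_root(parent, j):
    """List the nodes visited when walking from node j up to the root."""
    path = [j]
    while parent[path[-1]] != -1:
        path.append(int(parent[path[-1]]))
    return path


# Hypothetical 4 x 4 "arrow" matrix: diagonal plus a dense last row and column.
P = 4.0 * np.eye(4)
P[3, :3] = 1.0
P[:3, 3] = 1.0
L, parent = elimination_tree(P)
```

For this arrow matrix every column other than the last has its only subdiagonal nonzero in the last row, so all three point to the last node. A tolerance-based test can misclassify entries that vanish through numerical cancellation, so a symbolic factorization would be the more robust choice in practice.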

Theorems & Definitions (57)

Theorem 1.1: Main result I, input-sparsity time
Theorem 1.2: Main result II, small treewidth
Definition 2.1: Leverage score
Lemma 2.2: Folklore
Lemma 2.3: Leverage score computation, Lemma 4.3 in [dsw22]
Definition 2.4: Tree decomposition and treewidth of a graph [bghk95, d06, lms13]
Definition 2.5: Dual graph
Definition 2.6: Cholesky factorization
Lemma 2.7: Fast Cholesky factorization [bghk95, d06]
Remark 2.8
...and 47 more
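
As a concrete companion to Definition 2.4, the three standard conditions of a tree decomposition (every vertex appears in some bag, every edge lies inside some bag, and the bags containing a given vertex are connected in the tree) can be checked with a short sketch. This assumes the paper's definition matches the standard one; the function names and the example graph are hypothetical, and the code does not verify that `tree_edges` actually forms a tree.

```python
from collections import defaultdict


def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three tree-decomposition conditions for bags indexed by tree node."""
    # Condition 1: the union of all bags equals the vertex set.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: each graph edge is contained in at least one bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Condition 3: for each vertex, the bags holding it are connected in the tree.
    adj = defaultdict(set)
    for s, t in tree_edges:
        adj[s].add(t)
        adj[t].add(s)
    for v in vertices:
        holding = {k for k, bag in bags.items() if v in bag}
        start = next(iter(holding))
        seen, stack = {start}, [start]
        while stack:                      # depth-first search restricted to `holding`
            cur = stack.pop()
            for nxt in adj[cur] & holding:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if seen != holding:
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# Hypothetical example: the path graph a - b - c - d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
good_bags = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
bad_bags = {1: {"a", "b"}, 2: {"c", "d"}, 3: {"b", "c"}}
tree_edges = [(1, 2), (2, 3)]
```

With the tree $1 - 2 - 3$, `good_bags` satisfies all three conditions and has width 1, while `bad_bags` fails the connectivity condition because vertex `b` sits in bags 1 and 3 but not in bag 2 between them. The treewidth of a graph is the minimum width over all of its valid decompositions, which this checker does not compute.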
