Faster Algorithm for Structured John Ellipsoid Computation

Yang Cao, Xiaoyu Li, Zhao Song, Xin Yang, Tianyi Zhou

TL;DR

The paper addresses the computational challenge of finding the John Ellipsoid inside a centrally symmetric polytope, formalized as $P = \{ x : -\mathbf{1}_n \le Ax \le \mathbf{1}_n \}$ with rank-$d$ matrix $A$. It introduces two fast approximation methods: a sketching-based algorithm that achieves near input-sparsity time with per-iteration cost $\widetilde{O}(\epsilon^{-1}\mathrm{nnz}(A) + \epsilon^{-2} d^{\omega})$ and total iterations $T = O(\epsilon^{-1}\log(n/d))$, and a treewidth-based algorithm that runs in $O(n\tau^2)$ per iteration when the dual graph of $A$ has treewidth $\tau$. Both approaches preserve the containment relations $\frac{1}{\sqrt{1+\epsilon}} Q \subseteq P \subseteq \sqrt{d}\,Q$ for the computed ellipsoid $Q$, improving upon the previous $\widetilde{O}(n d^2)$ per-iteration cost. The methods rely on leverage score approximation, sampling, and sketching in the input-sparsity setting, and on a $\tau$-sparse Cholesky-based decomposition in the small-treewidth setting, with a telescoping analysis to bound error accumulation. Together, these results yield faster algorithms for computing near-optimal John Ellipsoids, with implications for D-optimal design and related convex-optimization tasks in high dimensions.

Abstract

The famous theorem of Fritz John states that any convex body has a unique maximal volume inscribed ellipsoid, known as the John Ellipsoid. Computing the John Ellipsoid is a fundamental problem in convex optimization. In this paper, we focus on approximating the John Ellipsoid inscribed in a convex and centrally symmetric polytope defined by $P := \{ x \in \mathbb{R}^d : -\mathbf{1}_n \leq A x \leq \mathbf{1}_n \}$, where $A \in \mathbb{R}^{n \times d}$ is a rank-$d$ matrix and $\mathbf{1}_n \in \mathbb{R}^n$ is the all-ones vector. We develop two efficient algorithms for approximating the John Ellipsoid. The first is a sketching-based algorithm that runs in nearly input-sparsity time $\widetilde{O}(\mathrm{nnz}(A) + d^{\omega})$, where $\mathrm{nnz}(A)$ denotes the number of nonzero entries in the matrix $A$ and $\omega \approx 2.37$ is the current matrix multiplication exponent. The second is a treewidth-based algorithm that runs in time $\widetilde{O}(n \tau^2)$, where $\tau$ is the treewidth of the dual graph of the matrix $A$. Our algorithms significantly improve upon the state-of-the-art running time of $\widetilde{O}(n d^2)$ achieved by [Cohen, Cousins, Lee, and Yang, COLT 2019].
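
To make the object being approximated concrete, here is a minimal sketch of a dense fixed-point iteration on per-constraint weights. It is not the sketching-based or treewidth-based algorithm of this paper. It assumes the update $w_i \leftarrow w_i \cdot a_i^\top (A^\top \mathrm{diag}(w) A)^{-1} a_i$ with a uniform start $w = (d/n)\mathbf{1}_n$ and averaging over the iterates, which is how the baseline iteration is usually described. The function names `john_weights` and `max_constraint_value` and the random test matrix are illustrative assumptions.

```python
import numpy as np

def john_weights(A, num_iters):
    """Average of the fixed-point iterates w <- w * sigma(w), where
    sigma_i(w) = a_i^T (A^T diag(w) A)^{-1} a_i (assumed baseline update)."""
    n, d = A.shape
    w = np.full(n, d / n)            # uniform start; the weights sum to d
    avg = np.zeros(n)
    for _ in range(num_iters):
        avg += w / num_iters         # accumulate the current iterate
        M = A.T @ (w[:, None] * A)   # A^T diag(w) A, a d x d matrix
        sigma = np.einsum("ij,ij->i", A @ np.linalg.inv(M), A)
        w = w * sigma                # dense update, O(n d^2) work per iteration
    return avg

def max_constraint_value(A, v):
    """max_i a_i^T (A^T diag(v) A)^{-1} a_i. With Q = {x : x^T A^T diag(v) A x <= 1},
    a value of at most 1 + eps means Q / sqrt(1 + eps) lies inside P."""
    M = A.T @ (v[:, None] * A)
    return np.einsum("ij,ij->i", A @ np.linalg.inv(M), A).max()

# Hypothetical dense instance with n = 40 constraints in d = 3 dimensions.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 3))
v = john_weights(A, num_iters=50)
print(v.sum(), max_constraint_value(A, v))
```

The per-iteration cost here is dominated by forming $A^\top \mathrm{diag}(w) A$ and evaluating all $n$ quadratic forms, which is the $O(nd^2)$ step that the two algorithms in the abstract aim to speed up.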

Paper Structure

This paper contains 45 sections, 30 theorems, 74 equations, 4 figures, 1 table, 1 algorithm.

Key Result

Theorem 1.1

Given a matrix $A \in \mathbb{R}^{n \times d}$, let a symmetric convex polytope be defined as $P := \{ x \in \mathbb{R}^d : -\mathbf{1}_n \leq A x \leq \mathbf{1}_n \}$. For any $\epsilon, \delta \in (0, 0.1)$, where $\delta$ denotes the failure probability, there exists a randomized algorithm (Algorithm ...) ... Moreover, it runs within $O(\epsilon^{-1} \log(n/d))$ iterations and each iteration takes ...

Figures (4)

  • Figure 1: Time complexity comparison between CCLY19 and ours, assuming $n = d^a$, $\epsilon = \Theta(1)$, and ignoring $\log$ factors. The $x$-axis corresponds to $a$ and the $y$-axis corresponds to $b$, where $n^b$ is the total running time.
  • Figure 2: The geometric interpretation of the output ellipsoid. Let $P$ be a given input polytope. We can find an ellipsoid $Q$ so that $\frac{1}{\sqrt{1+\epsilon}} Q \subseteq P \subseteq \sqrt{d} \cdot Q$.
  • Figure 3: (a) A graph $G(V,E)$. (b) A tree decomposition of $G$. The union of the vertices in all bags is the set of nodes $a, \cdots, i$, which equals $V(G)$. For every edge $(u,v) \in E(G)$, at least one bag contains both $u$ and $v$. For example, for edge $(c,b)$ in graph $G$, bag $1$ contains both $c$ and $b$. Furthermore, the bags containing any one node of $(a)$ form a connected subtree of the tree in $(b)$. For example, the bags containing node $c$ are bags $1,2,3$, which form a subtree. Similarly, the bags containing node $f$ are bags $3,5$, which also form a subtree. For edge $(c,f)$, bags $2$ and $3$ both contain vertices $c$ and $f$. For edge $(i,g)$, bag $5$ contains vertices $i$ and $g$.
  • Figure 4: (a) A $10 \times 10$ positive definite matrix $P = AA^{\top}$, where the blue dots represent the nonzero elements of $P$. (b) The Cholesky factor $L$ of $AA^{\top}$. (c) The corresponding elimination tree for matrix $P$, where each node represents one column of the Cholesky factor. Since the row index of the first subdiagonal nonzero entry of the $6$-th column is $8$, the parent of node $6$ is $8$. Furthermore, the nonzero pattern of this column is $\{6,8,10\}$, which is a subset of the vertices on the path from node $6$ to the root in the elimination tree.
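
The elimination-tree rule in the Figure 4 caption (the parent of a column is the row index of its first subdiagonal nonzero) can be sketched in a few lines. This is only an illustration under assumptions: the $4 \times 4$ matrix below is a hypothetical example, not the matrix from the figure, the helper names `elimination_tree` and `path_to_root` are made up here, indices are 0-based, and a dense Cholesky factorization is used instead of a sparse one.

```python
import numpy as np

def elimination_tree(L, tol=1e-12):
    """parent[j] = row index of the first subdiagonal nonzero in column j of L,
    or -1 when column j has no subdiagonal nonzero (a root)."""
    n = L.shape[0]
    parent = np.full(n, -1)
    for j in range(n):
        below = np.nonzero(np.abs(L[j + 1:, j]) > tol)[0]
        if below.size:
            parent[j] = j + 1 + below[0]
    return parent

def path_to_root(parent, j):
    """Nodes visited when walking from j up to the root of its tree."""
    path = [j]
    while parent[path[-1]] != -1:
        path.append(int(parent[path[-1]]))
    return path

# Hypothetical positive definite matrix: nodes 0 and 1 touch node 2, node 2 touches node 3.
P = 4.0 * np.eye(4)
for i, j in [(0, 2), (1, 2), (2, 3)]:
    P[i, j] = P[j, i] = 1.0

L = np.linalg.cholesky(P)            # lower-triangular factor with P = L L^T
parent = elimination_tree(L)
print(parent.tolist())

# The nonzero pattern of every column lies on its path to the root.
for j in range(4):
    pattern = set(np.nonzero(np.abs(L[:, j]) > 1e-12)[0].tolist())
    assert pattern <= set(path_to_root(parent, j))
```

In this small example nodes 0 and 1 are both children of node 2, so the tree branches rather than forming a single path.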

Theorems & Definitions (57)

  • Theorem 1.1: Main result I, input-sparsity time
  • Theorem 1.2: Main result II, small treewidth
  • Definition 2.1: Leverage score
  • Lemma 2.2: Folklore
  • Lemma 2.3: Leverage score computation, Lemma 4.3 in [DSW22]
  • Definition 2.4: Tree decomposition and treewidth of a graph [BGHK95, D06, LMS13]
  • Definition 2.5: Dual graph
  • Definition 2.6: Cholesky factorization
  • Lemma 2.7: Fast Cholesky factorization [BGHK95, D06]
  • Remark 2.8
  • ...and 47 more
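
Leverage scores (Definition 2.1 in the list above) underlie the input-sparsity algorithm. As a small illustration, assuming the standard definition $\sigma_i(A) = a_i^\top (A^\top A)^{-1} a_i$ for a full-column-rank $A$, they can be computed from a thin QR factorization. The helper name `leverage_scores` and the random matrix are assumptions made for this sketch, and this is the exact dense computation rather than the fast approximate one the paper relies on.

```python
import numpy as np

def leverage_scores(A):
    """sigma_i(A) = a_i^T (A^T A)^{-1} a_i, computed as the squared row norms
    of the orthonormal factor of a reduced QR (assumes full column rank)."""
    Q, _ = np.linalg.qr(A)           # Q is n x d with orthonormal columns
    return np.sum(Q * Q, axis=1)

# Hypothetical instance with n = 30 rows and d = 4 columns.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 4))
sigma = leverage_scores(A)

# Well-known properties: each score lies in [0, 1] and the scores sum to d.
print(sigma.min(), sigma.max(), sigma.sum())
```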