Radial Isotropic Position via an Implicit Newton's Method
Arun Jambulapati, Jonathan Li, Kevin Tian
TL;DR
The paper gives a faster algorithm for computing ε-approximate Forster transforms, which place a dataset in radial isotropic position. It combines new Hessian stability bounds on Barthe's objective with an implicit Laplacian sparsification primitive to run a box-constrained Newton's method. When the transform has a polynomially bounded aspect ratio, the runtime almost matches the cost of evaluating Barthe's objective once. The key subroutine reduces almost-linear time sparsification of graph Laplacians to the ability to compute matrix-vector products in almost-linear time, which makes fast second-order optimization possible without forming the Laplacian explicitly. The resulting runtimes improve on previous methods based on cutting planes, and a smoothed-analysis guarantee gives explicit bounds on the aspect ratio under Gaussian perturbations. Forster transforms have applications in functional analysis, communication complexity, coding theory, and the design of learning algorithms.
Abstract
Placing a dataset $A = \{\mathbf{a}_i\}_{i \in [n]} \subset \mathbb{R}^d$ in radial isotropic position, i.e., finding an invertible $\mathbf{R} \in \mathbb{R}^{d \times d}$ such that the unit vectors $\{(\mathbf{R} \mathbf{a}_i) \|\mathbf{R} \mathbf{a}_i\|_2^{-1}\}_{i \in [n]}$ are in isotropic position, is a powerful tool with applications in functional analysis, communication complexity, coding theory, and the design of learning algorithms. When the transformed dataset has a second moment matrix within a $\exp(\pm \epsilon)$ factor of a multiple of $\mathbf{I}_d$, we call $\mathbf{R}$ an $\epsilon$-approximate Forster transform. We give a faster algorithm for computing approximate Forster transforms, based on optimizing an objective defined by Barthe [Barthe98]. When the transform has a polynomially-bounded aspect ratio, our algorithm uses $O(nd^{\omega - 1}(\frac{n}{\epsilon})^{o(1)})$ time to output an $\epsilon$-approximate Forster transform with high probability, when one exists. This is almost the natural limit of this approach, as even evaluating Barthe's objective takes $O(nd^{\omega - 1})$ time. Previously, the state-of-the-art runtime in this regime was based on cutting-plane methods, and scaled at least as $\approx n^3 + n^2 d^{\omega - 1}$. We also provide explicit estimates on the aspect ratio in the smoothed analysis setting, and show that our algorithm similarly improves upon those in the literature. To obtain our results, we develop a subroutine of potential broader interest: a reduction from almost-linear time sparsification of graph Laplacians to the ability to support almost-linear time matrix-vector products. We combine this tool with new stability bounds on Barthe's objective to implicitly implement a box-constrained Newton's method [CMTV17, ALOW17].
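
To make the definition concrete, the following is a minimal sketch (not the paper's algorithm) of how one might check whether a given matrix is an approximate Forster transform. The function names `normalized_second_moment` and `is_approx_forster` are hypothetical, the rows of `A` are assumed to be the nonzero points of the dataset, and the phrase "within a factor of a multiple of the identity" is interpreted here as "the largest and smallest eigenvalues of the second moment matrix differ by a factor of at most exp(2 * eps)", which is equivalent to the existence of some suitable multiple.

```python
import numpy as np

def normalized_second_moment(A, R):
    """Return (1/n) * sum_i u_i u_i^T, where u_i = R a_i / ||R a_i||_2.

    A has shape (n, d) with one nonzero point a_i per row; R has shape (d, d).
    """
    B = A @ R.T                                   # row i is R a_i
    U = B / np.linalg.norm(B, axis=1, keepdims=True)
    return U.T @ U / A.shape[0]

def is_approx_forster(A, R, eps):
    """Check whether exp(-eps) * c * I <= M <= exp(eps) * c * I for some c > 0.

    Assumption: this holds exactly when lambda_max <= exp(2 * eps) * lambda_min,
    taking c as the geometric mean of the two extreme eigenvalues.
    """
    M = normalized_second_moment(A, R)
    eigs = np.linalg.eigvalsh(M)                  # ascending order
    return bool(eigs[-1] <= np.exp(2 * eps) * eigs[0])
```

For instance, scaled standard basis vectors are already in radial isotropic position under the identity map, so `is_approx_forster(np.array([[2.0, 0.0], [0.0, 5.0]]), np.eye(2), 0.01)` holds, whereas adding a third point close to the first one breaks the condition for small `eps`.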
