Discrepancy Minimization in Input-Sparsity Time

Yichuan Deng; Xiaoyu Li; Zhao Song; Omri Weinstein

TL;DR

This work studies algorithmic discrepancy minimization for real-valued matrices, aiming for input-sparsity time colorings. It introduces a near-optimal combinatorial framework that matches Larsen's approximation while achieving runtimes of $\tilde{O}(\mathrm{nnz}(A) + n^{3})$ and, with fast matrix multiplication, $\tilde{O}(\mathrm{nnz}(A) + n^{2.53})$, by leveraging a new hereditary projection technique and an efficient Edge-Walk implementation. The key innovations are a sketch-based projection that avoids explicitly forming $B_t$, implicit leverage-score sampling to approximate top eigen-directions, and a lazy, batch-oriented data structure for computing projected Gaussian vectors, which together overcome the $O(mn^{2})$ per-step bottlenecks and the cubic barrier in near-square regimes. The results nearly close the gap between real-valued and binary matrices for input-sparsity time coloring and offer a practical combinatorial alternative to SDP-based approaches, with potential impact on discrepancy theory and related optimization problems.
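The sketch-based projection can be illustrated with a small, self-contained example. This is only a minimal sketch under stated assumptions, not the paper's implementation: it assumes $B = A(I - V^\top V)$ with orthonormal rows in $V$ (as the figure captions describe), uses a plain Gaussian JL matrix $R$, and all function names (`orthonormal_rows`, `sketched_row_norms`, `exact_row_norms`) are hypothetical. The point is that $BR = A\,[(I - V^\top V)R]$ can be evaluated by first forming the small $n \times k$ matrix $(I - V^\top V)R$ and then touching each nonzero of $A$ only $k$ times, so $B$ itself is never formed.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_rows(vectors):
    """Gram-Schmidt: orthonormal rows spanning the same space as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = dot(u, w)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        if norm > 1e-12:
            basis.append([wi / norm for wi in w])
    return basis


def sketched_row_norms(sparse_rows, V, n, k, rng):
    """Estimate the row norms of B = A (I - V^T V) without forming B.

    sparse_rows: rows of A as {column: value} dicts (assumed input format).
    V: list of orthonormal rows. k: sketch dimension (assumed parameter).
    """
    # Gaussian JL matrix R (n x k), scaled so that E||R^T b||^2 = ||b||^2.
    R = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)] for _ in range(n)]
    # VR = V R, then PR = (I - V^T V) R = R - V^T (V R); both are small.
    VR = [[sum(v[i] * R[i][c] for i in range(n)) for c in range(k)] for v in V]
    PR = [[R[i][c] - sum(v[i] * vr[c] for v, vr in zip(V, VR)) for c in range(k)]
          for i in range(n)]
    norms = []
    for row in sparse_rows:
        sk = [0.0] * k
        for j, a in row.items():  # each nonzero of A costs O(k) work
            for c in range(k):
                sk[c] += a * PR[j][c]
        norms.append(math.sqrt(dot(sk, sk)))
    return norms


def exact_row_norms(sparse_rows, V, n):
    """Reference computation that forms each row of B explicitly."""
    norms = []
    for row in sparse_rows:
        b = [row.get(j, 0.0) for j in range(n)]
        for v in V:
            c = dot(v, b)
            b = [bj - c * vj for bj, vj in zip(b, v)]
        norms.append(math.sqrt(dot(b, b)))
    return norms


rng = random.Random(0)
n, m, k = 20, 8, 300
A = [{j: rng.uniform(-1.0, 1.0) for j in rng.sample(range(n), 5)} for _ in range(m)]
V = orthonormal_rows([[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(3)])
approx = sketched_row_norms(A, V, n, k, rng)
exact = exact_row_norms(A, V, n)
```

With sketch dimension $k$, each estimate in `approx` typically deviates from the matching entry of `exact` by a relative error on the order of $1/\sqrt{k}$, which is the usual JL trade-off between accuracy and sketch size.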

Abstract

A recent work by [Larsen, SODA 2023] introduced a faster combinatorial alternative to Bansal's SDP algorithm for finding a coloring $x \in \{-1, 1\}^n$ that approximately minimizes the discrepancy $\mathrm{disc}(A, x) := \| A x \|_{\infty}$ of a real-valued $m \times n$ matrix $A$. Larsen's algorithm runs in $\widetilde{O}(mn^2)$ time compared to Bansal's $\widetilde{O}(mn^{4.5})$-time algorithm, with a slightly weaker logarithmic approximation ratio in terms of the hereditary discrepancy of $A$ [Bansal, FOCS 2010]. We present a combinatorial $\widetilde{O}(\mathrm{nnz}(A) + n^3)$-time algorithm with the same approximation guarantee as Larsen's, optimal for tall matrices where $m = \mathrm{poly}(n)$. Using a more intricate analysis and fast matrix multiplication, we further achieve a runtime of $\widetilde{O}(\mathrm{nnz}(A) + n^{2.53})$, breaking the cubic barrier for square matrices and surpassing the limitations of linear-programming approaches [Eldan and Singh, RS&A 2018]. Our algorithm relies on two key ideas: (i) a new sketching technique for finding a projection matrix with a short $\ell_2$-basis using implicit leverage-score sampling, and (ii) a data structure for efficiently implementing the iterative Edge-Walk partial-coloring algorithm [Lovett and Meka, SICOMP 2015], together with an alternative analysis that enables "lazy" batch updates with low-rank corrections. Our results nearly close the computational gap between real-valued and binary matrices, for which input-sparsity time coloring was recently obtained by [Jain, Sah and Sawhney, SODA 2023].
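To make the objective concrete, the following sketch evaluates $\mathrm{disc}(A, x) = \| A x \|_{\infty}$ for a sparse matrix in time proportional to $\mathrm{nnz}(A)$, and brute-forces the hereditary discrepancy of a tiny matrix. It assumes the standard definitions, which this excerpt names but does not spell out: $\mathrm{disc}(A) = \min_x \mathrm{disc}(A, x)$, and $\mathrm{herdisc}(A)$ is the maximum of $\mathrm{disc}$ over all submatrices obtained by keeping a subset of the columns. The function names and the dict-per-row storage format are illustrative assumptions; the brute-force search is exponential and is meant only for toy inputs.

```python
from itertools import combinations, product


def disc_of_coloring(sparse_rows, x):
    """||A x||_inf for rows stored as {column: value}; costs O(nnz(A))."""
    return max(
        (abs(sum(v * x[j] for j, v in row.items())) for row in sparse_rows),
        default=0,
    )


def disc_restricted(sparse_rows, cols):
    """Brute-force min over colorings of A restricted to the columns `cols`."""
    best = None
    for signs in product((-1, 1), repeat=len(cols)):
        x = dict(zip(cols, signs))
        value = max(
            (abs(sum(v * x[j] for j, v in row.items() if j in x))
             for row in sparse_rows),
            default=0,
        )
        if best is None or value < best:
            best = value
    return best


def herdisc(sparse_rows, n):
    """Brute-force hereditary discrepancy: max over nonempty column subsets."""
    return max(
        disc_restricted(sparse_rows, cols)
        for size in range(1, n + 1)
        for cols in combinations(range(n), size)
    )


# Incidence matrix of a triangle: every row contains two of the three columns.
triangle = [{0: 1, 1: 1}, {1: 1, 2: 1}, {0: 1, 2: 1}]
triangle_value = disc_of_coloring(triangle, [1, -1, 1])
triangle_herdisc = herdisc(triangle, 3)

# A single row [1, 1]: perfectly balanced as a whole, but not hereditarily.
pair = [{0: 1, 1: 1}]
pair_disc = disc_restricted(pair, (0, 1))
pair_herdisc = herdisc(pair, 2)
```

The second example shows why approximation guarantees are stated in terms of hereditary discrepancy: the row $[1, 1]$ has discrepancy 0, but dropping either column leaves a submatrix of discrepancy 1.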

Paper Structure

This paper contains 81 sections, 34 theorems, 214 equations, 7 figures, 4 tables, and 7 algorithms.

Introduction
Our Results
Related Work
Algorithmic discrepancy theory.
Sketching and Leverage score sampling.
Technical Overview
Overview and Barriers of Larsen's Algorithm
Our Techniques
Robust Analysis of Larsen's Algorithm
Approximate norm-estimation suffices.
Approximate SVD suffices.
Overcoming the Barriers
Speeding-up the "hereditary-projection" step
Implicit Leverage-Score Sampling
Faster Iterative Coloring
...and 66 more sections

Key Result

Theorem 1.3

For any parameter $a \in [0,1]$, there is a randomized algorithm that, given a real matrix $A \in \mathbb{R}^{m \times n}$, finds a coloring $x \in \{-1,+1\}^n$ such that […]. Moreover, it runs in time […]. Here, $\omega(a, b, c)$ denotes the time for multiplying an $n^a \times n^b$ matrix with an $n^b \times n^c$ matrix, and $\omega := \omega(1,1,1)$ denotes the exponent of fast matrix multiplication (FM

Figures (7)

Figure 1: We use a sketching technique to reduce the cost of computing row norms. $A \in \mathbb{R}^{m \times n}$ is the original data matrix, $I \in \mathbb{R}^{n \times n}$ is the $n \times n$ identity matrix, $V^\top V \in \mathbb{R}^{n \times n}$ is the projection matrix onto the row span of $V$, $R \in \mathbb{R}^{n \times \widetilde{O}(\epsilon_0^{-2})}$ is our JL sketching matrix, and $\widehat{B} \in \mathbb{R}^{m \times \widetilde{O}(\epsilon_0^{-2})}$ is the sketched matrix. After sketching, we can quickly query the row norms of $\widehat{B}$, which approximate the row norms of $B$ to within accuracy $\epsilon_0$. We select rows of $B$ using these approximate norms. For details of the selection operation, see Figure 2 and Figure 3.
Figure 2: The difference in row selection. (a) Larsen's algorithm explicitly selects the $m_t$ rows of largest norm from $B_t = A(I - V_t^\top V_t)$ and computes the eigenvectors of $\overline{B}_t^\top \overline{B}_t$, which is expensive. (b) In our design, we use a subsampling matrix $\widetilde{D}_t$ to generate the matrix $\widetilde{B}_t$, whose eigenvalues are close to those of $\overline{B}_t$. This subsampling reduces the cost of computing the SVD, so the eigenvectors can be computed quickly.
Figure 3: The operation at the $t$-th iteration of the projection algorithm; (a) shows Larsen's approach and (b) shows ours. (a) Larsen's algorithm selects the $m_t$ rows of largest norm from the matrix $B_{t} = A(I - V_{t-1}^\top V_{t-1})$, so every row that is not selected has norm less than the threshold $C_0 \cdot T \cdot \mathrm{herdisc}(A)$. However, computing the row norms exactly is expensive. (b) We compute the sketched matrix $\widetilde{B}_{t} = A(I - V_{t-1}^\top V_{t-1})R$, where $R$ is a JL matrix, and select rows based on the approximate norms, which significantly reduces the cost of norm computation. We show that, in our setting, the rows that are not selected satisfy a slightly weaker guarantee: their norms are bounded by $(1 + \epsilon_0) \cdot C_0 \cdot T \cdot \mathrm{herdisc}(A)$. This constant-factor loss does not affect the correctness of our algorithm. (Because the row norms are only approximate, the selection can miss some of the truly largest rows; these unselected rows are shown in darker red.)
Figure 4: Flowcharts of the partial-coloring algorithm of [l22] and of ours. In (a), the blue blocks are the steps that dominate the running time. In (b), the red blocks are the steps we design to remove those costs. Here $\mathsf{g} \in \mathbb{R}^n$ is the newly generated Gaussian vector and $g$ is the projected vector, that is, $g := (I - V^\top V)\mathsf{g}$.
Figure 5: The decomposition of the output vector of Query. The green component is the precomputed part and the blue component is the newly added part; together they form the projection of $g$ onto the row span of $V$.
...and 2 more figures
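The decomposition described for Query in Figure 5 can be sketched as follows. This is a toy illustration under stated assumptions rather than the paper's data structure: the class name `LazyProjector` and its methods are hypothetical, the rows of $V$ are kept orthonormal by Gram-Schmidt, and the batch step uses plain loops where the paper relies on fast matrix multiplication. A batch of Gaussian vectors is sampled up front, the part of the projection coming from rows already folded in is stored, and a query only adds a low-rank correction for the rows inserted since the last batch step.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class LazyProjector:
    """Returns (I - V^T V) g for pre-sampled Gaussian vectors g (toy version)."""

    def __init__(self, n, batch_size, seed=0):
        rng = random.Random(seed)
        self.gaussians = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                          for _ in range(batch_size)]
        # precomputed[i] = projection of gaussians[i] onto span(old_rows).
        self.precomputed = [[0.0] * n for _ in range(batch_size)]
        self.old_rows = []  # rows already folded into `precomputed`
        self.new_rows = []  # rows added since the last batch rebuild

    def add_row(self, v):
        """Orthonormalize v against the current rows; skip it if dependent."""
        w = [float(x) for x in v]
        for u in self.old_rows + self.new_rows:
            c = dot(u, w)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        if norm <= 1e-12:
            return False
        self.new_rows.append([wi / norm for wi in w])
        return True

    def _span_part(self, g, rows):
        """Projection of g onto the span of the given orthonormal rows."""
        out = [0.0] * len(g)
        for v in rows:
            c = dot(v, g)
            out = [o + c * vj for o, vj in zip(out, v)]
        return out

    def query(self, i):
        """Precomputed part plus a correction that uses only the new rows."""
        g = self.gaussians[i]
        fresh = self._span_part(g, self.new_rows)
        return [gj - pj - fj for gj, pj, fj in zip(g, self.precomputed[i], fresh)]

    def rebuild(self):
        """Batch step: fold the new rows into the stored projections."""
        for i, g in enumerate(self.gaussians):
            fresh = self._span_part(g, self.new_rows)
            self.precomputed[i] = [p + f for p, f in zip(self.precomputed[i], fresh)]
        self.old_rows += self.new_rows
        self.new_rows = []


proj = LazyProjector(n=6, batch_size=3, seed=1)
proj.add_row([1, 0, 0, 0, 0, 0])
proj.add_row([1, 1, 0, 0, 0, 0])
before = proj.query(0)   # answered with two pending rows
proj.rebuild()
after = proj.query(0)    # answered from the precomputed part only
proj.add_row([0, 0, 1, 1, 0, 0])
g1 = proj.query(1)       # precomputed part plus one pending row
```

A query costs work proportional to the number of pending rows rather than to the full rank of $V$, which is the motivation for delaying the expensive update until a batch of rows has accumulated.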

Theorems & Definitions (89)

Definition 1.1: Discrepancy
Definition 1.2: Hereditary discrepancy
Theorem 1.3: Main result, informal version of Theorem \ref{thm:hereditary_minimize_correct} and Theorem \ref{thm:hereditary_minimize_time}
Remark 1.4
Theorem 1.5: Fast hereditary projection, informal version of Theorem \ref{thm:time_proj} and Theorem \ref{thm:correct_proj}
Definition A.1: Discrepancy
Definition A.2: Hereditary Discrepancy
Lemma A.3
Theorem A.4: l17
Claim A.5
...and 79 more
