The Space Complexity of Approximating Logistic Loss

Gregory Dexter, Petros Drineas, Rajiv Khanna

TL;DR

The paper addresses the space efficiency of data structures that approximate the logistic loss in logistic regression, using the dataset-dependent complexity measure $\mu_{\mathbf{y}}(\mathbf{X})$ (first defined in Munteanu, 2018) to capture compressibility. It proves two lower bounds: a $\tilde{\Omega}\left(\frac{d}{\varepsilon^2}\right)$ space requirement when $\mu_{\mathbf{y}}(\mathbf{X})=O(1)$, and a general $\tilde{\Omega}\left(d\cdot \mu_{\mathbf{y}}(\mathbf{X})\right)$ bound for constant $\varepsilon$, showing that the dependence on $\mu_{\mathbf{y}}(\mathbf{X})$ is intrinsic rather than an artifact of mergeable coreset constructions. The work also gives a polynomial-time linear-programming formulation for computing $\mu_{\mathbf{y}}(\mathbf{X})$, refuting a prior conjecture that it is hard to compute, and includes empirical comparisons to prior approximate methods. Overall, the results imply that existing coreset bounds are optimal up to lower-order factors in the bounded-$\mu$ regime, while revealing fundamental limits to compressing logistic regression data beyond these bounds.

Abstract

We provide space complexity lower bounds for data structures that approximate logistic loss up to $\varepsilon$-relative error on a logistic regression problem with data $\mathbf{X} \in \mathbb{R}^{n \times d}$ and labels $\mathbf{y} \in \{-1,1\}^n$. The space complexity of existing coreset constructions depends on a natural complexity measure $\mu_\mathbf{y}(\mathbf{X})$, first defined in (Munteanu, 2018). We give an $\tilde{\Omega}(\frac{d}{\varepsilon^2})$ space complexity lower bound in the regime $\mu_\mathbf{y}(\mathbf{X}) = O(1)$ that shows existing coresets are optimal in this regime up to lower order factors. We also prove a general $\tilde{\Omega}(d\cdot \mu_\mathbf{y}(\mathbf{X}))$ space lower bound when $\varepsilon$ is constant, showing that the dependency on $\mu_\mathbf{y}(\mathbf{X})$ is not an artifact of mergeable coresets. Finally, we refute a prior conjecture that $\mu_\mathbf{y}(\mathbf{X})$ is hard to compute by providing an efficient linear programming formulation, and we empirically compare our algorithm to prior approximate methods.
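
To make the $\varepsilon$-relative error guarantee concrete, the following is a minimal sketch rather than anything taken from the paper. It assumes the standard logistic loss $\sum_i \log(1 + \exp(-y_i \mathbf{x}_i^\top \beta))$ and compares it, for one query $\beta$, against a weighted loss over a smaller set of rows. The function names and the weighted-subset representation are illustrative assumptions; an actual data structure must meet the guarantee for every query $\beta$, not just one.

```python
import numpy as np

def logistic_loss(X, y, beta):
    """Assumed form of the loss: sum_i log(1 + exp(-y_i * x_i^T beta))."""
    margins = y * (X @ beta)
    # logaddexp(0, -m) evaluates log(1 + exp(-m)) without overflow.
    return float(np.sum(np.logaddexp(0.0, -margins)))

def weighted_logistic_loss(Xs, ys, ws, beta):
    """Loss of a hypothetical weighted summary (rows Xs, labels ys, weights ws)."""
    margins = ys * (Xs @ beta)
    return float(np.sum(ws * np.logaddexp(0.0, -margins)))

def relative_error(X, y, Xs, ys, ws, beta):
    """|approx - full| / full for a single query vector beta."""
    full = logistic_loss(X, y, beta)
    approx = weighted_logistic_loss(Xs, ys, ws, beta)
    return abs(approx - full) / full

def uniform_summary(X, y, m, seed=0):
    """Illustrative baseline only: m uniformly sampled rows, each weighted n/m."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    return X[idx], y[idx], np.full(m, n / m)
```

A summary is an $\varepsilon$-relative error approximation at $\beta$ when `relative_error(...) <= eps`. Uniform sampling is used here only because it is simple to state; it is not one of the coreset constructions the abstract refers to.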

Paper Structure

This paper contains 16 sections, 12 theorems, 52 equations, 1 figure.

Key Result

Lemma 2.1

Given a data set $\mathbf{X} \in \mathbb{R}^{n \times d}$ and $\mathcal{B} \subset \mathbb{R}^d$ such that $\inf_{\beta\in \mathcal{B}} \mathcal{R}(\beta; \mathbf{X}) > 1$, if there exists a data structure $\tilde{\mathcal{L}}(\cdot)$ that satisfies: then there exists a data structure taking $\tilde{\mathcal{O}}(1)$ extra space such that:

Figures (1)

  • Figure 1: Simulated data results for exact computation of $\mu_\mathbf{y}(\mathbf{X})$ (Theorem \ref{thm:compute_complexity_measure}) using the full data (Exactfull) and sketched data (ExactSketched), compared with the approximate upper (ApprSketchedUpper) and lower (ApprSketchedLower) bounds suggested by (Munteanu, 2018) (see Section \ref{sec:appendixExperiments}). The results show that the upper and lower bounds can be very loose compared to our exact calculation of the complexity measure $\mu_\mathbf{y}(\mathbf{X})$.
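
The exact computation mentioned in the caption can be illustrated with a small linear program. The sketch below is not the paper's formulation, which is not reproduced on this page. It assumes the definition $\mu_\mathbf{y}(\mathbf{X}) = \sup_{\beta} \|(\mathbf{Z}\beta)^+\|_1 / \|(\mathbf{Z}\beta)^-\|_1$ with $\mathbf{Z} = \mathrm{diag}(\mathbf{y})\mathbf{X}$, and uses one possible reduction: because $\mathbf{1}^\top \mathbf{Z}\beta = \|(\mathbf{Z}\beta)^+\|_1 - \|(\mathbf{Z}\beta)^-\|_1$, fixing $\mathbf{1}^\top \mathbf{Z}\beta = 1$ and minimizing $t = \|\mathbf{Z}\beta\|_1$ gives a ratio of $(t+1)/(t-1)$. The function name `complexity_measure` and the tolerance `tol` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def complexity_measure(X, y, tol=1e-9):
    """Sketch of mu = sup_beta ||(Z beta)^+||_1 / ||(Z beta)^-||_1, Z = diag(y) X.

    Assumed definition and reduction; not the formulation from the paper.
    """
    Z = y[:, None] * X            # row i is y_i * x_i
    n, d = Z.shape
    s = Z.sum(axis=0)             # s @ beta = ||(Z beta)^+||_1 - ||(Z beta)^-||_1
    if np.allclose(s, 0.0):
        # Positive and negative parts always have equal mass, so the ratio is 1.
        return 1.0
    # Variables: beta (d entries, free), then t (n entries, t_i >= |(Z beta)_i|).
    c = np.concatenate([np.zeros(d), np.ones(n)])       # minimize sum(t)
    I = np.eye(n)
    A_ub = np.block([[Z, -I], [-Z, -I]])                # +/- Z beta - t <= 0
    b_ub = np.zeros(2 * n)
    A_eq = np.concatenate([s, np.zeros(n)])[None, :]    # s @ beta = 1
    b_eq = np.array([1.0])
    bounds = [(None, None)] * d + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    t_star = res.fun              # minimal ||Z beta||_1 subject to s @ beta = 1
    if t_star <= 1.0 + tol:
        # Some beta has no negative part: the ratio is unbounded.
        return float("inf")
    return (t_star + 1.0) / (t_star - 1.0)
```

For example, with `X = np.array([[1.0], [1.0], [1.0]])` and `y = np.array([1.0, 1.0, -1.0])`, the choice $\beta = 1$ gives positive mass 2 and negative mass 1, and the sketch returns a value of 2 up to solver tolerance. The LP has $d + n$ variables, so this version is intended for small illustrative inputs.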

Theorems & Definitions (14)

  • Definition 1
  • Lemma 2.1
  • Lemma 2.2
  • Theorem 1
  • Corollary 2
  • Lemma 2.3
  • Lemma 2.4
  • Theorem 3
  • Theorem 4
  • Theorem 5: Theorem 2 in Hoeff63
  • ...and 4 more