Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method

Chenyang Li, Zhao Song, Zhaoxing Xu, Junze Yin

TL;DR

This work studies the inverse problem of the leverage score gradient, aiming to recover model parameters from the gradient of leverage-score-based objectives. It introduces an iterative approximate Newton method that uses subsampled leverage score distributions to form an approximate Hessian, achieving near-input-sparsity time per iteration and fast convergence. Under standard Lipschitz and positive-definite assumptions, the method converges to an $\varepsilon$-approximate solution in $T = \log\left(\|x_0 - x^*\|_2 / \varepsilon\right)$ iterations, with a per-iteration cost of $O\big((\mathrm{nnz}(A) + d^{\omega}) \mathrm{poly}(\log(n/\delta))\big)$. This approach blends randomized numerical linear algebra with convex optimization to enable efficient, privacy-aware leverage-score-based learning and analysis.
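
To illustrate where a logarithmic iteration count of this kind comes from, suppose each approximate Newton step shrinks the distance to $x^*$ by a fixed factor $\rho \in (0,1)$. The factor $\rho$ is an assumption made here for illustration; its value is not taken from the paper. Unrolling the recursion gives

$$
\|x_{t+1} - x^*\|_2 \le \rho \, \|x_t - x^*\|_2
\;\Longrightarrow\;
\|x_T - x^*\|_2 \le \rho^{T} \, \|x_0 - x^*\|_2 \le \varepsilon
\quad \text{whenever} \quad
T \ge \frac{\log\left(\|x_0 - x^*\|_2 / \varepsilon\right)}{\log(1/\rho)} .
$$

For any constant $\rho$, the denominator is a constant, so the number of iterations scales as $\log\left(\|x_0 - x^*\|_2 / \varepsilon\right)$.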

Abstract

Leverage scores have become essential in statistics and machine learning, aiding regression analysis, randomized matrix computations, and various other tasks. This paper delves into the inverse problem, aiming to recover the intrinsic model parameters given the leverage score gradient. This endeavor not only enriches the theoretical understanding of models trained with leverage score techniques but also has substantial implications for data privacy and adversarial security. We specifically scrutinize the inversion of the leverage score gradient, denoted as $g(x)$. An innovative iterative algorithm is introduced for the approximate resolution of the regularized least squares problem stated as $\min_{x \in \mathbb{R}^d} 0.5 \|g(x) - c\|_2^2 + 0.5\|\mathrm{diag}(w)Ax\|_2^2$. Our algorithm employs subsampled leverage score distributions to compute an approximate Hessian in each iteration, under standard assumptions, considerably mitigating the time complexity. Given that a total of $T = \log(\| x_0 - x^* \|_2 / \varepsilon)$ iterations are required, the cost per iteration is optimized to the order of $O\big((\mathrm{nnz}(A) + d^{\omega}) \cdot \mathrm{poly}(\log(n/\delta))\big)$, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$.
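
The idea of forming an approximate Hessian from a subsampled leverage score distribution can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's algorithm. It assumes the Hessian has the form $A^\top D A$ with a positive diagonal $D$. It uses the standard leverage score $\sigma_i = b_i^\top (B^\top B)^{-1} b_i$ of the rows of $B = D^{1/2} A$, computed exactly via QR; exact computation takes $O(nd^2)$ time, so it does not reach the near-input-sparsity cost quoted above. It also replaces the paper's objective with a simple convex stand-in. All function names (`leverage_scores`, `sampled_hessian`, `approximate_newton`, and the `toy_*` helpers) are hypothetical.

```python
import numpy as np


def leverage_scores(B):
    """Row leverage scores sigma_i = b_i^T (B^T B)^{-1} b_i, via a thin QR.

    Exact computation costs O(n d^2); it is used here only for clarity.
    """
    Q, _ = np.linalg.qr(B)          # B = QR, Q has orthonormal columns
    return np.sum(Q * Q, axis=1)    # squared row norms of Q


def sampled_hessian(A, diag, num_samples, rng):
    """Approximate A^T diag(diag) A by sampling rows by leverage score."""
    B = np.sqrt(diag)[:, None] * A              # B^T B = A^T D A
    p = leverage_scores(B)
    p = p / p.sum()                             # sampling distribution
    idx = rng.choice(B.shape[0], size=num_samples, replace=True, p=p)
    # Rescale sampled rows so the estimator is unbiased for B^T B.
    rows = B[idx] / np.sqrt(num_samples * p[idx])[:, None]
    return rows.T @ rows


def toy_gradient(A, c, w, x):
    # Stand-in objective (NOT the paper's):
    #   sum_i log cosh((Ax - c)_i) + 0.5 * ||diag(w) A x||_2^2
    Ax = A @ x
    return A.T @ (np.tanh(Ax - c) + (w ** 2) * Ax)


def toy_hessian_diag(A, c, w, x):
    # The stand-in Hessian is A^T diag(sech^2(Ax - c) + w^2) A.
    return 1.0 / np.cosh(A @ x - c) ** 2 + w ** 2


def approximate_newton(A, c, w, x0, num_iters, num_samples, rng):
    """Newton iterations with a freshly sampled Hessian at each step."""
    x = x0.copy()
    grad_norms = []
    for _ in range(num_iters):
        g = toy_gradient(A, c, w, x)
        grad_norms.append(np.linalg.norm(g))
        D = toy_hessian_diag(A, c, w, x)
        H_approx = sampled_hessian(A, D, num_samples, rng)
        x = x - np.linalg.solve(H_approx, g)
    grad_norms.append(np.linalg.norm(toy_gradient(A, c, w, x)))
    return x, grad_norms


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 2000, 5
    A = rng.standard_normal((n, d)) / np.sqrt(n)
    c = 0.1 * rng.standard_normal(n)
    x, norms = approximate_newton(A, c, np.ones(n), np.zeros(d), 30, 1000, rng)
    print("initial gradient norm:", norms[0])
    print("final gradient norm:  ", norms[-1])
```

Sampling with probabilities proportional to leverage scores keeps the rows that matter most for the spectrum of $B^\top B$, so a sample much smaller than $n$ can preserve the Hessian up to a small relative error. Because the Hessian is only approximate, the iteration converges linearly rather than quadratically, which matches the logarithmic iteration count stated in the abstract.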

Paper Structure

This paper contains 44 sections, 34 theorems, 172 equations, and 1 algorithm.

Key Result

Theorem 1.4

Let $\epsilon, \delta \in (0,0.1)$. Given $A \in \mathbb{R}^{n \times d}$, $b \in \mathbb{R}^n$, $w \in \mathbb{R}^n$, we let $x^*$ be the optimal solution of our regularized least squares problem (see Definition def:our). Let $x_0 \in \mathbb{R}^d$ be close to $x^*$ (see Definition def:f_ass), where ...

Theorems & Definitions (83)

  • Definition 1.1: Leverage score
  • Definition 1.2: Leverage score inversion problem [LSW+24]
  • Definition 1.3: Leverage score gradient inversion problem
  • Theorem 1.4: Informal version of our main result (Theorem \ref{thm:main_formal})
  • proof
  • Definition 4.1
  • Lemma 4.2: Informal version of Lemma \ref{lem:L_c}
  • Lemma 5.1: Informal version of Lemma \ref{lem:gradient_Al}
  • Lemma 5.2: Informal version of Lemma \ref{lem:L_c_Hessian}
  • Definition 5.3: Definition 4.8 of [DLS23]
  • ...and 73 more