Sparse Polyak with optimal thresholding operators for high-dimensional M-estimation
Tianqi Qiao, Marie Maros
TL;DR
This paper addresses high-dimensional M-estimation under sparsity by generalizing Sparse Polyak, an adaptive step-size method, to work with a broad class of sparsifying operators $\Phi_s$. When the operator has bounded relative concavity $\eta_{s^*}(\Phi_s)$, the iterates contract and the convergence rate remains dimension-invariant. In particular, Reciprocal Thresholding (RT) reduces the required sparsity level from $s=O(s^*\bar{\kappa}^2)$ to $s=O(s^*\bar{\kappa})$ and improves final accuracy by a factor of $\bar{\kappa}$. The main theoretical result is the contraction bound $\|\theta_{t+1}-\widehat{\theta}\|^2 \le (1-1/(40\bar{\kappa})+4\eta_{s^*}(\Phi_s))\|\theta_t-\widehat{\theta}\|^2$, with corollaries for sparse linear models and generalized linear models (GLMs) showing near-optimal statistical precision independent of the ambient dimension $d$. Numerical experiments on sparse logistic regression show faster convergence and sparser solutions with RT, supporting the method's scalability to very high-dimensional GLM problems.
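To illustrate why a small relative concavity matters, the contraction bound can be unrolled under one extra assumption that is made here purely for illustration and is not taken from the paper: $\eta_{s^*}(\Phi_s) \le 1/(320\bar{\kappa})$. The sketch below also ignores any additive statistical-error term that the full theorem may carry.

$$
1-\frac{1}{40\bar{\kappa}}+4\eta_{s^*}(\Phi_s)
\;\le\; 1-\frac{1}{40\bar{\kappa}}+\frac{1}{80\bar{\kappa}}
\;=\; 1-\frac{1}{80\bar{\kappa}},
$$

$$
\|\theta_T-\widehat{\theta}\|^2
\;\le\; \Bigl(1-\frac{1}{80\bar{\kappa}}\Bigr)^{T}\|\theta_0-\widehat{\theta}\|^2
\;\le\; e^{-T/(80\bar{\kappa})}\,\|\theta_0-\widehat{\theta}\|^2 .
$$

Under this assumption, $T \ge 80\bar{\kappa}\log(1/\varepsilon)$ iterations suffice to shrink the squared error by a factor $\varepsilon$, and the count does not depend on $d$. Conversely, if $4\eta_{s^*}(\Phi_s)$ were as large as $1/(40\bar{\kappa})$, the factor would reach $1$ and the bound would no longer guarantee contraction.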
Abstract
We propose and analyze a variant of Sparse Polyak for high-dimensional M-estimation problems. Sparse Polyak uses an adaptive step-size rule tailored to estimate the problem's curvature in the high-dimensional setting, guaranteeing that the algorithm's performance does not deteriorate as the ambient dimension increases. However, its convergence guarantees come at the cost of solution sparsity and statistical accuracy. In this work, we introduce a variant of Sparse Polyak that retains its desirable scaling properties with respect to the ambient dimension while obtaining sparser and more accurate solutions.
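As a rough illustration of the kind of iteration discussed here, the following is a minimal sketch of a Polyak-type adaptive step combined with a sparsifying operator, written for least squares. It is not the paper's exact step-size rule and it does not implement Reciprocal Thresholding: it uses plain hard thresholding as the operator, measures the gradient through its `s` largest entries, and assumes a known target loss value. The names `hard_threshold`, `polyak_iht`, and `f_target` are hypothetical and chosen only for this example.

```python
import numpy as np


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest."""
    if s >= v.size:
        return v.copy()
    out = np.zeros_like(v)
    if s > 0:
        keep = np.argpartition(np.abs(v), -s)[-s:]
        out[keep] = v[keep]
    return out


def polyak_iht(X, y, s, f_target=0.0, n_iter=200):
    """Hard-thresholded gradient descent with a Polyak-type step (illustrative).

    Assumptions of this sketch (not taken from the paper):
      * the loss is least squares, f(theta) = ||X theta - y||^2 / (2n);
      * f_target is a known value of the loss at the target solution;
      * the step divides the loss gap by the squared norm of the
        hard-thresholded gradient, so it depends on s rather than on d.
    """
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iter):
        resid = X @ theta - y
        loss = 0.5 * np.mean(resid ** 2)
        grad = X.T @ resid / n
        g_s = hard_threshold(grad, s)      # sparse view of the gradient
        denom = float(np.dot(g_s, g_s))
        if denom == 0.0 or loss <= f_target:
            break                          # nothing left to improve
        step = (loss - f_target) / denom   # Polyak-type adaptive step
        theta = hard_threshold(theta - step * grad, s)
    return theta


# Usage on a small synthetic, noiseless sparse regression problem.
rng = np.random.default_rng(0)
n, d, s = 300, 400, 6
theta_star = np.zeros(d)
theta_star[:3] = [2.0, -2.0, 2.0]
X = rng.standard_normal((n, d))
y = X @ theta_star
theta_hat = polyak_iht(X, y, s)
print("nonzeros:", np.count_nonzero(theta_hat))
print("error:", np.linalg.norm(theta_hat - theta_star))
```

The step size needs no Lipschitz or condition-number constant; it adapts through the ratio of the loss gap to the sparse gradient norm. Swapping `hard_threshold` for another sparsifying operator changes only the two lines that call it.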
