A Polynomial-time Algorithm for Online Sparse Linear Regression with Improved Regret Bound under Weaker Conditions
Junfan Li, Shizhong Liao, Zenglin Xu, Liqiang Nie
TL;DR
This work tackles online sparse linear regression (OSLR), where the learner may observe only a subset of the features of each instance when making a prediction. By combining a Dantzig Selector-based estimator with an algorithm-dependent sampling scheme, adaptive parameter tuning, and a batching online Newton step, the authors obtain significantly improved regret bounds under the compatibility condition, which is weaker than the RIP and linear-independence-of-features assumptions used in other prior work. They establish tighter $\ell_1$-norm error bounds for the estimators and use a regret decomposition to control the constants and the dependence on $d$, $k$, and $T$. The algorithm also extends to the $(k,k_0,d)$ setting, in which $k_0$ additional attributes can be observed after each prediction. The results demonstrate that polynomial-time OSLR is achievable with strong performance guarantees under milder assumptions, advancing both theory and potential practice in resource-constrained online prediction. The work also outlines directions for reducing reliance on linear programs, tightening lower bounds, and broadening the observation model.
Abstract
In this paper, we study the problem of online sparse linear regression (OSLR), where the algorithms are restricted to accessing only $k$ out of $d$ attributes per instance for prediction, a problem that has been proved to be NP-hard. Previous work gave polynomial-time algorithms assuming that the data matrix satisfies the linear independence of features, the compatibility condition, or the restricted isometry property. We introduce a new polynomial-time algorithm that significantly improves previous regret bounds (Ito et al., 2017) under the compatibility condition, which is weaker than the other two assumptions. The improvements stem from a tighter convergence rate of the $\ell_1$-norm error of our estimators. Our algorithm leverages the well-studied Dantzig Selector, together with several novel techniques, including an algorithm-dependent sampling scheme for estimating the covariance matrix, an adaptive parameter tuning scheme, and a batching online Newton step with careful initializations. We also give novel and non-trivial analyses, including an induction method for analyzing the $\ell_1$-norm error, careful analyses of the covariance of non-independent random variables, and a decomposition of the regret. We further extend our algorithm to OSLR with additional observations, where the algorithms can observe $k_0$ additional attributes after each prediction, and improve previous regret bounds (Kale et al., 2017; Ito et al., 2017).
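To make the limited-observation setting concrete, the following minimal sketch shows how an estimate of the outer product $x x^\top$ (the building block of a covariance estimate) can be formed when only $k$ of the $d$ attributes are visible. It is an illustration under stated assumptions, not the paper's method: it assumes uniform sampling without replacement and $k \ge 2$, whereas the paper uses an algorithm-dependent sampling scheme, and all function names are hypothetical.

```python
import random
from itertools import combinations


def sample_features(d, k, rng):
    """Pick k of d feature indices uniformly without replacement (assumed scheme)."""
    return sorted(rng.sample(range(d), k))


def unbiased_outer_estimate(x, observed, d, k):
    """Estimate x x^T using only the observed coordinates of x.

    Under uniform sampling, P(i observed) = k/d and, for i != j,
    P(i and j both observed) = k(k-1) / (d(d-1)). Dividing each
    observed product by its inclusion probability makes the estimate
    unbiased. Requires k >= 2 so that off-diagonal pairs can be seen.
    """
    p_diag = k / d
    p_off = k * (k - 1) / (d * (d - 1))
    est = [[0.0] * d for _ in range(d)]
    for i in observed:
        for j in observed:
            p = p_diag if i == j else p_off
            est[i][j] = x[i] * x[j] / p
    return est


def average_over_all_subsets(x, d, k):
    """Exact expectation of the estimate, by enumerating every size-k subset."""
    subsets = list(combinations(range(d), k))
    avg = [[0.0] * d for _ in range(d)]
    for s in subsets:
        est = unbiased_outer_estimate(x, s, d, k)
        for i in range(d):
            for j in range(d):
                avg[i][j] += est[i][j] / len(subsets)
    return avg


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, -2.0, 0.5, 3.0]
    observed = sample_features(d=4, k=2, rng=rng)
    print("observed indices:", observed)
    print("single-round estimate:", unbiased_outer_estimate(x, observed, 4, 2))
```

Averaging such single-round estimates over many rounds gives a covariance estimate of the kind a Dantzig Selector-style estimator can consume. The sketch covers only the estimation side; it does not show how predictions are made from $k$ attributes or how regret is controlled.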
