Dynamic Incremental Optimization for Best Subset Selection
Shaogang Ren, Xiaoning Qian
TL;DR
The paper addresses generalized best subset selection with non-convex $\ell_0$ regularization by deriving a dual formulation and solving it with a primal-dual algorithm. It introduces dual range estimation and an active incremental strategy that uses safe screening to prune inactive features and gradually grow the active set, which reduces redundant computation. The authors establish strong duality, analyze convergence and polynomial-time complexity, and report competitive or improved efficiency on synthetic linear-regression data and the News20 dataset. In high-dimensional sparse learning tasks, the approach yields higher-quality sparse estimators at reduced computational cost.
Abstract
Best subset selection is considered the "gold standard" for many sparse learning problems. A variety of optimization techniques have been proposed to attack this non-smooth, non-convex problem. In this paper, we investigate the dual forms of a family of $\ell_0$-regularized problems. We develop an efficient primal-dual algorithm based on the structures of the primal and dual problems. By combining dual range estimation with an incremental strategy, our algorithm can reduce redundant computation and improve the solutions of best subset selection. Theoretical analysis and experiments on synthetic and real-world datasets validate the efficiency and statistical properties of the proposed solutions.
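For orientation, the problem class named in the abstract can be illustrated on a toy least-squares instance. The sketch below is not the paper's primal-dual algorithm: it is a plain exhaustive search, which is only feasible for a handful of features and is exactly the cost that screening and incremental strategies aim to avoid. The objective $\frac{1}{2}\|y - X\beta\|_2^2 + \lambda_0 \|\beta\|_0$, the function name `best_subset_brute_force`, and the parameter `lam0` are assumptions chosen for illustration and are not taken from the paper.

```python
import itertools

import numpy as np


def best_subset_brute_force(X, y, lam0):
    """Exhaustively minimize 0.5 * ||y - X b||^2 + lam0 * ||b||_0.

    Illustrative only: the search visits all 2^p supports.
    """
    n, p = X.shape
    # Start from the empty support (all coefficients zero).
    best_obj = 0.5 * float(y @ y)
    best_beta = np.zeros(p)
    for k in range(1, p + 1):
        for support in itertools.combinations(range(p), k):
            cols = list(support)
            # Least-squares fit restricted to the candidate support.
            coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            resid = y - X[:, cols] @ coef
            obj = 0.5 * float(resid @ resid) + lam0 * k
            if obj < best_obj:
                best_obj = obj
                best_beta = np.zeros(p)
                best_beta[cols] = coef
    return best_beta, best_obj


# Toy noiseless data with a 2-sparse ground truth (hypothetical values).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
beta_true = np.array([2.0, 0.0, 0.0, -3.0, 0.0, 0.0])
y = X @ beta_true

beta_hat, obj = best_subset_brute_force(X, y, lam0=0.1)
print(np.flatnonzero(beta_hat), obj)
```

On this noiseless example the search recovers the true support, since any larger support pays an extra `lam0` per feature without lowering the residual. The number of candidate supports doubles with each added feature, which is why methods that prune inactive features early matter in high dimensions.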
