Stochastic Variance-Reduced Iterative Hard Thresholding in Graph Sparsity Optimization
Derek Fox, Samuel Hernandez, Qianqian Tong
TL;DR
The paper tackles graph-structured sparsity optimization on large-scale data, where stochastic gradients suffer from inherent variance. It introduces two stochastic variance-reduced gradient methods, GraphSVRG-IHT and GraphSCSG-IHT, which use head and tail projections to enforce graph-structured sparsity and leverage variance-reduction techniques for non-convex objectives. The authors provide a general theoretical framework proving linear convergence with a constant learning rate, and validate the approach on synthetic data and a real breast cancer gene dataset, showing improved convergence and better gene selection performance. The work advances efficient, scalable sparsity-constrained learning in graph-structured domains, with potential impact on disease monitoring and network analysis, and lays the groundwork for applying these methods to larger real-world datasets.
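To make the combination of variance reduction and hard thresholding concrete, the following is a minimal sketch of an SVRG-style loop with a sparsity projection on a least-squares objective. It is not the authors' implementation. The paper's head and tail projections onto graph-structured supports are replaced here by plain top-s magnitude thresholding, and the function names (`hard_threshold`, `svrg_iht`), the objective, and all parameter defaults are assumptions chosen for illustration.

```python
import numpy as np


def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest.

    Assumption: this plain top-s projection stands in for the graph-structured
    head/tail projections described in the paper.
    """
    out = np.zeros_like(x)
    if s > 0:
        keep = np.argpartition(np.abs(x), -s)[-s:]
        out[keep] = x[keep]
    return out


def svrg_iht(A, y, s, lr=0.01, epochs=30, inner=None, seed=0):
    """SVRG with hard thresholding for
    min_x (1/2n) * ||A x - y||^2  subject to  ||x||_0 <= s.

    All parameter names and defaults are hypothetical.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    inner = n if inner is None else inner
    x_ref = np.zeros(d)  # reference (snapshot) point
    for _ in range(epochs):
        # Full gradient at the snapshot, computed once per outer loop.
        full_grad = A.T @ (A @ x_ref - y) / n
        x = x_ref.copy()
        for _ in range(inner):
            i = rng.integers(n)
            a = A[i]
            # Variance-reduced gradient: grad_i(x) - grad_i(x_ref) + full_grad.
            g = a * (a @ x - y[i]) - a * (a @ x_ref - y[i]) + full_grad
            # Gradient step with a constant learning rate, then projection.
            x = hard_threshold(x - lr * g, s)
        x_ref = x
    return x_ref
```

The correction term `grad_i(x) - grad_i(x_ref) + full_grad` keeps the stochastic gradient unbiased while shrinking its variance as `x` approaches `x_ref`, which is what allows a constant learning rate. A graph-aware variant would swap `hard_threshold` for projections that respect the graph structure.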
Abstract
Stochastic optimization algorithms are widely used for large-scale data analysis due to their low per-iteration costs, but they often suffer from slow asymptotic convergence caused by inherent variance. Variance-reduced techniques have therefore been used to address this issue in structured sparse models utilizing sparsity-inducing norms or $\ell_0$-norms. However, these techniques are not directly applicable to complex (non-convex) graph sparsity models, which are essential in applications like disease outbreak monitoring and social network analysis. In this paper, we introduce two stochastic variance-reduced gradient-based methods to solve graph sparsity optimization: GraphSVRG-IHT and GraphSCSG-IHT. We provide a general framework for theoretical analysis, demonstrating that our methods enjoy a linear convergence rate. Extensive experiments validate
