Randomized batch-sampling Kaczmarz methods for general linear systems
Dong-Yue Xie, Xi Yang
TL;DR
The paper studies randomized Kaczmarz-type methods for solving general linear systems $Ax=b$ and introduces a unified randomized batch-sampling Kaczmarz (RBSK) framework. Using concentration inequalities, it derives scale-invariant mean-squared convergence bounds that apply to randomized non-extended block Kaczmarz methods with static sampling, and it shows how the batch distribution ${\bf P}$ and a diagonal scaling $S$ influence the rate through quantities such as $\xi$ and $\eta$. On synthetic multi-scale and ill-conditioned matrices, as well as real sparse data from SuiteSparse, the new bounds are tighter than the existing ND14 and GM21 bounds and align more closely with empirical convergence. The paper also highlights the potential of learning-guided batch-sampling distributions to tailor performance to specific applications while keeping the per-iteration cost low. Overall, the RBSK framework offers a flexible, theoretically sharp, and practically effective approach to fast randomized block Kaczmarz solvers for large-scale linear problems.
Abstract
To investigate randomized solvers for general linear systems in greater depth, we adopt a unified randomized batch-sampling Kaczmarz framework whose per-iteration cost is as low as that of cyclic block methods, and we develop a general analysis technique to establish its convergence guarantee. Using concentration inequalities, we derive new bounds on the expected linear convergence rate. The analysis applies to any randomized non-extended block Kaczmarz method with static stochastic sampling. In addition, the new rate bounds are scale-invariant, which eliminates their dependence on the magnitude of the data matrix. In most experiments, the new bounds are significantly tighter than existing ones and better reflect the empirical convergence behavior of block methods. Within this framework, the batch-sampling distribution acts as a learnable parameter that may allow block methods to perform efficiently in specific application scenarios, a possibility that deserves further investigation.
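To make the setting concrete, the following is a minimal sketch of a generic randomized block Kaczmarz iteration with a static batch-sampling distribution. It is not the authors' RBSK algorithm: the projection step via a least-squares solve, the fixed row partition, the uniform distribution, and the names `block_kaczmarz`, `blocks`, and `probs` are all assumptions made for illustration.

```python
import numpy as np

def block_kaczmarz(A, b, blocks, probs, iters=500, seed=0):
    """Generic randomized block Kaczmarz sketch (illustrative, not the paper's method).

    blocks: list of row-index arrays (hypothetical fixed partition of the rows)
    probs:  static sampling distribution over the blocks
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        # Sample one batch of rows from the static distribution.
        j = rng.choice(len(blocks), p=probs)
        rows = blocks[j]
        A_J = A[rows]
        r_J = b[rows] - A_J @ x
        # Minimum-norm correction d with A_J d = r_J, i.e. projection of x
        # onto the solution set of the sampled equations.
        x += np.linalg.lstsq(A_J, r_J, rcond=None)[0]
    return x

# Small consistent test system (synthetic data, assumed for the demo).
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 10))
x_true = rng.standard_normal(10)
b = A @ x_true

blocks = np.array_split(np.arange(60), 12)   # 12 batches of 5 rows each
probs = np.full(len(blocks), 1.0 / len(blocks))  # uniform batch sampling

x = block_kaczmarz(A, b, blocks, probs)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(f"relative error: {rel_err:.2e}")
```

Replacing `probs` with a non-uniform vector is the simplest way to experiment with how the batch-sampling distribution affects the observed convergence in this toy setting.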
